ABSTRACT


Development  of  a  comprehensive  theory  of  mathematical  games  has  been 
hampered  by  philosophical,  conceptual,  and  practical  difficulties.  For  dynamic  games 
in  particular,  solution  methods  are  elusive,  and  algorithms  are  rare.  This  is 
especially  apparent  for  games  which  require  that  the  competitors  randomize,  or  mix, 
their  tactics  even  though  such  randomization  is  a  common  property  of  actual  competitive 
situations. This dissertation is therefore concerned with the development of a
technique for the synthesis of mixed strategy solutions of games.

A  special  class  of  dynamic  games  is  studied:  two-person  zero-sum  noise-free 
multistage  games  of  fixed  duration  for  which  the  payoff  and  dynamic  functions  are 
multivariable polynomials and the control vectors are elements of compact hypercubes.
The  problem  is  formulated  such  that  known  results  concerning  existence  of  saddlepoint 
solutions  are  applicable;  emphasis  is  on  the  determination  of  the  value  and  of  the 
optimal  mixed  strategies  and  on  the  properties  of  the  solution  functions.  This  is 
achieved by extending and applying the method of dual cones such that the game becomes
a maximization problem and the optimal strategies are derived from the interaction of
two special convex sets. It is shown that this maximization problem can be approximated
in a straightforward and intuitively satisfying manner by a linear programming
problem.

In  the  approach  used,  the  state  vector  of  the  game  is  a  parameter.  For  this 
reason  the  continuity  properties  of  the  functional  dependence  of  the  value  and  the 
strategies upon this parameter are investigated. One result is that for a game with
quadratic  payoff  and  linear  dynamics  the  value  function  is  piecewise  quadratic. 

Because U.S. Air Force systems, be they missile, space, tactical,
aeronautical, or other systems, inevitably are to be utilized in competitive
(differential game) situations, and because a comprehensive theory of
mathematical games has yet to be developed, the results presented in this
report were evolved with the ultimate goal of a comprehensive theory of
differential games in mind. Numerous basic results are contained herein,
and their utility and significance are illustrated by application to
numerous examples.

CHAPTER 1


INTRODUCTION 

The  mathematical  theory  of  games  is  still  a  relatively 
immature  discipline  with  a  multitude  of  theoretical  and  practical 
problems.  Solution  of  those  problems  will  bring  about  increased 
understanding  of  cooperation  and  competition  in  such  diverse  fields 
as  anthropology,  economics,  military  defense,  diplomacy,  sports, 
and  behavioral  psychology.  It  is  even  possible  that  game  theory 
will become a major branch of applied mathematics, for it encompasses
optimization theory as a special case while introducing new
questions due to its concern with the interactions of multiple
intelligent participants.

One objective in the theory of games is to determine, for
any  given  situation,  the  best  tactics  for  each  participant  to  use 
and  the  payoff  to  each  when  all  use  their  best  tactics.  In  practice 
the  theory  is  applied  to  a  mathematical  representation,  or  model, 
of  the  actual  situation,  and  the  adequacy  of  a  particular  analysis 
depends  upon  both  the  sensibility  of  the  model  and  the  intuitive 
acceptability  of  the  results.  This  need  for  realism  leads  to  a 
requirement  that  the  theory  be  applicable,  for  example,  to  dynamic 
situations  with  multiple  competitors  whose  knowledge  of  the  true 
situation  may  at  times  be  incomplete,  and  indeed  researchers  are 
attempting  to  resolve  the  mathematical  difficulties  presented  by 
such  cases. 

It is well known, however, that in many types of competition
the participants diversify their tactics so that under similar
circumstances their actions vary and are unpredictable to their opponents.
Such  mixing,  or  randomization,  of  tactics  is  common  in  many 
sports  and  parlor  games  and  in  guerrilla  warfare.  It  also  underlies 
such  maneuvers  as  bluffing  and  feinting.  Thus  one  would  expect 
the  theory  to  produce  randomized  tactics  as  solutions  of  its  models. 

Surprisingly,  although  solutions  of  games  based  upon  static 
situations  are  often  randomized,  this  is  not  presently  the  case  for 
most  dynamic  games.  Therefore,  this  paper  is  concerned  with 
developing  a  theory  which  produces  randomized  tactics  as  needed  in 
the  solution  of  a  particular  class  of  dynamic  games.  The  class 
studied is that of perfectly competitive situations with only two
participants,  the  so-called  two-person  zero-sum  games.  The 
dynamics  of  the  game  are  modeled  by  multistage  equations, 
and  each  player  knows  all  pertinent  information  concerning  the  game 
except  the  future  tactics  of  his  opponent.  The  dynamics  and  payoff 
functions  which  define  the  game  are  multivariable  polynomials. 
Finally,  each  play  of  the  game  lasts  a  fixed  number  of  stages,  and 
the  players  choose  their  control  actions  as  elements  from  compact 
hypercubes. 

Such  specialized  games  should  prove  to  have  wide  application. 
The two-person zero-sum model is often used, and multistage
dynamics may be more accurate for representing applications such
as  business  decisions  than  continuous  dynamics.  Furthermore, 
polynomials  are  frequently  employed  in  models  of  real  situations. 
Particular applications which may be foreseen include pursuit-
evasion  and  weapon  allocation  problems  for  defense  purposes, 
optimal  pricing  and  advertising  determination  for  direct  business 
competitions,  resource  allocation  for  political  campaigns,  and 
perhaps  even  game  plan  determination  for  some  sports  and  parlor 
games.  The  theoretical  results  of  this  paper  will  allow  approximate 
solutions  of  these  and  other  problems  for  which  suitable  models 
of  the  requisite  type  are  derivable. 

The existence of saddlepoint solutions using mixed strategies
has been established for this class of problems; the concern in this
report  is  with  developing  a  theory  for  actual  synthesis  of  those 
solutions,  a  task  accomplished  by  extending  the  theory  of  dual  cones 
originally  developed  by  S.  Karlin,  M.  Dresher,  and  L.  Shapley 
for  a  restricted  class  of  static  games.  Suitable  background  material 
and relevant definitions are in Chapter 2. The theoretical
development begins in Chapter 3 with a precise definition of the problem
of  interest. 

The principal theoretical results and discussions concerning
approximate  solutions  are  in  Chapters  4  and  5.  In  the  first  of  these, 
the  problem  is  attacked  by  solving  a  special  static  game.  The 
solution  is  obtained  by  reducing  the  problem  of  finding  the  optimal 
mixed  strategies  to  a  problem  of  determining  the  generalized 
moments  of  such  strategies.  Next  the  sets  of  admissible  moments 
and  certain  convex  cones  which  they  generate  are  described.  The 
value  of  the  game  and  the  optimal  moments  are  then  obtained  by 
exploiting  features  of  the  dual  convex  cones.  The  chapter  is 
concluded with discussions of computational aspects of the solution
method, including an approximate formulation as a linear
programming problem.
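
As a minimal illustration of the reduction to moments, consider the simplest case of scalar controls $u, v \in [0, 1]$ and a polynomial payoff. The coefficients $a_{ij}$, the distribution functions $F$ and $G$, and the moment symbols $\mu_i$ and $\nu_j$ are notation assumed here for the sketch; they are not taken from the later chapters, which treat vector controls and generalized moments. If the players randomize independently, the expected payoff depends on the mixed strategies only through their moments:

```latex
J(u, v) = \sum_{i=0}^{m} \sum_{j=0}^{n} a_{ij}\, u^i v^j ,
\qquad
\mathcal{E}(J) = \int_0^1 \!\! \int_0^1 J(u, v) \, dF(u) \, dG(v)
              = \sum_{i=0}^{m} \sum_{j=0}^{n} a_{ij}\, \mu_i \nu_j ,
\qquad
\mu_i = \int_0^1 u^i \, dF(u), \quad
\nu_j = \int_0^1 v^j \, dG(v).
```

The expected payoff is thus bilinear in two finite moment vectors, so the search over probability measures may be replaced by a search over admissible moment vectors.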

In  Chapter  5  the  effects  of  introducing  the  dynamic  aspects 
of  the  game  are  examined.  The  essence  of  the  approach  is  that  the 
dynamic  game  is  reduced  to  a  sequence  of  static  games  in  which  the 
system  state  is  a  parameter.  The  applicability  of  the  method  of 
dual  cones  to  finding  open-loop  and  closed-loop  optimal  strategies 
is  discussed.  Then  continuity  properties  of  the  value  and  of  the 
optimal  strategies  as  functions  of  the  state  vector  are  evaluated 
in  detail.  Finally,  the  dual  cone  approach  is  utilized  to  prove  that 
the  value  of  games  with  linear  dynamics  and  quadratic  payoff  is 
piecewise  quadratic. 

Chapter  6  is  devoted  to  four  examples  which  illustrate  various 
aspects of the theory developed in Chapters 4 and 5. Chapter 7
contains a brief, formal discussion of the extension of the methods
developed  in  this  report  to  differential  games.  A  summary  of 
results  and  a  look  to  the  future  comprise  the  concluding  chapter, 
Chapter  8. 

The original contributions of this work are embodied in the
extension  of  the  method  of  dual  cones  to  include  vector  control 
elements,  the  creation  of  a  solution  technique  based  upon  that 
method,  the  manner  of  formulating  the  approximate  problem  so  that 
linear  programming  may  be  applied,  and  certain  aspects  of  the  use 
of the method for multistage games. Among the last of these,
the proof that certain linear-quadratic games have piecewise-
quadratic value functions is original, as are portions of the
arguments concerning continuity of the optimal mixed strategies. The
discussion of the extension to differential games also contains
original elements.




CHAPTER  2 


BACKGROUND 

Hundreds  of  research  works  concerning  various  aspects  of 
game  theory  have  been  published  since  the  field  was  founded  by 
von Neumann and Morgenstern in 1944 [1]. In this chapter we
review  the  history  and  the  commonly-used  definitions  for  the  control 
systems-oriented branch of mathematical game theory to which the
present study belongs. Section 2.3 contains a survey of the literature
which  is  particularly  relevant  to  the  synthesis  of  mixed  strategies 
for  dynamic  games. 

2.1 TERMINOLOGY

Useful  insight  into  a  situation  can  often  be  obtained  simply  by 
reviewing  its  terminology.  This  is  definitely  the  case  with  game 
theory.  Thus  it  is  fruitful  to  consider  definitions  and  concepts  at  this 
point.  This  terminology  is  relatively  standard  for  the  field,  and  we 
shall  neither  probe  its  nuances  nor  attempt  to  compile  a  dictionary. 

A  game  is  the  complete  set  of  rules,  definitions,  constraints, 
goals,  etc.,  which  describe  a  multi-participant  interaction,  whether 
it  be  competitive  or  cooperative.  The  participants  are  called 
players, and if there are n such participants, the game is called an
n-player or n-person game. A single contest or realization of the
game  is  called  a  play  or  partie. 

In a non-trivial game, the players are able to affect its course
and outcome. Mathematically it is said that the $j$-th player does this
by choosing a control or control vector $u_j$, or by choosing a sequence
$u_j = \{u_j(i)\}$ or a time history $u_j = u_j(t)$ of such vectors. Ordinarily the
control vectors are chosen from some set $U_j$, called the set of
admissible controls.


To further his own best interests during a partie, a player
does not usually behave haphazardly. Instead he uses a strategy, or
set of rules which govern his choice of controls depending upon his
observations of the course of the partie. Thus a strategy might be
thought of as a mapping $\phi$ from the set of all possible observed
situations into the set of admissible controls. If the control implied by a
strategy is always a unique function of the situation, then the strategy
is called a pure strategy. On the other hand, if the rule assigns
control vectors to a situation in a manner which involves randomness,
then it is called a mixed or randomized strategy. The essence of a
mixed strategy is the relative frequency of utilization of various
control vectors rather than the randomization mechanism, and it is
therefore common to refer to probability measures defined on the
sets of admissible controls as mixed strategies. Controls with
nonzero probability measure in a given situation are the ones which
are candidates for utilization, and these are said to belong to the
spectrum of the mixed strategy. Note that control vectors chosen
using mixed strategies are random variables.
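
The distinction between a mixed strategy, its spectrum, and the random controls it produces can be sketched for a finite control set. The control values, probabilities, and function names below are hypothetical and chosen only for illustration; the games studied in this report use controls from compact hypercubes rather than finite sets.

```python
import random


def spectrum(mixed_strategy):
    """Return the controls that carry nonzero probability."""
    return [u for u, prob in mixed_strategy.items() if prob > 0]


def sample_control(mixed_strategy, rng):
    """Draw one control; the control actually used is a random variable."""
    controls = list(mixed_strategy)
    weights = [mixed_strategy[u] for u in controls]
    return rng.choices(controls, weights=weights, k=1)[0]


# Hypothetical mixed strategy: a probability measure on three admissible controls.
mixed = {-1.0: 0.25, 0.0: 0.0, 1.0: 0.75}
# A pure strategy is the degenerate case with all probability on one control.
pure = {0.0: 1.0}

rng = random.Random(0)
draws = [sample_control(mixed, rng) for _ in range(10000)]
# The relative frequency of use approaches the assigned probability.
freq = draws.count(1.0) / len(draws)
```

The essential object is the dictionary of probabilities; the random number generator is only one possible randomization mechanism.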

Some games operate within a framework or system which
evolves over time (or some other parameter) in a manner which is
important to the structure of the game. We call such games dynamic
games, and their complement we call static games. The dynamic
system is usually described mathematically using a state or state
vector $z$ which is a function of the controls and of other parameters.
The  progression  of  the  state  during  a  partie  is  described  by  a 
dynamics  equation,  which  may  be  a  differential  equation, 

$$\dot{z} = f(z, u_1, \ldots, u_n, t)$$  (2.1)

or a difference equation

$$z(i+1) = f(z(i), u_1(i), \ldots, u_n(i); i)$$  (2.2)

In  the  former  case  the  game  is  called  a  differential  game,  and  in  the 
latter  it  is  referred  to  as  a  difference  game,  a  discrete  differential 
game,  or  a  multistage  game.  A  dynamic  game  whose  rules  prescribe 
that  a  partie  proceeds  for  exactly  T  time  units  or  N  stages  is  called 
a  game  of  fixed  duration. 
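
A two-player multistage game of fixed duration can be sketched by iterating a difference equation of the form (2.2) for exactly N stages. The scalar dynamics `f_example` and the control sequences below are hypothetical illustrations, not a model taken from this report.

```python
def play_partie(f, z0, controls_1, controls_2):
    """Run one partie: apply z(i+1) = f(z(i), u1(i), u2(i); i) for N stages.

    N is fixed in advance by the length of the control sequences.
    """
    if len(controls_1) != len(controls_2):
        raise ValueError("both players must supply one control per stage")
    z = z0
    trajectory = [z]
    for i, (u1, u2) in enumerate(zip(controls_1, controls_2)):
        z = f(z, u1, u2, i)
        trajectory.append(z)
    return trajectory


def f_example(z, u1, u2, i):
    """Hypothetical scalar dynamics: Player 1 pushes the state up, Player 2 down."""
    return z + u1 - u2


# Three stages (N = 3) starting from z(0) = 0.
traj = play_partie(f_example, 0.0, [1.0, 1.0, -1.0], [0.5, -0.5, 0.5])
```

The returned list holds the N + 1 states z(0), ..., z(N) visited during the partie.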

Along  with  the  direct  complications  which  dynamic  games 
introduce  come  several  conceptual  problems.  An  important  one 
of  these  is  that  the  nature  of  strategies  must  be  further  refined  to 
account for whether the players are allowed to expect to have
knowledge of the state whenever they choose control vectors. If not,
then  they  must  consider  the  possibility  of  making  open-loop  control 
choices  when  they  design  their  strategies,  and  the  resulting 
strategies  are  called  open-loop  strategies.  If  the  rules  allow  them 
to  expect  that  they  may  always  have  up-to-date  observations  on  which 
to  base  their  control  choices,  then  they  may  design  closed-loop 

strategies which depend upon those observations. For example, in
one simple differential game the $i$-th player may be required to
generate an open-loop mixed strategy function represented by a
conditional cumulative distribution function
$F_i(u_i(t) \mid z(\tau), u_i(s);\ \tau \le s \le t)$,
whereas in another such game he may design a closed-loop mixed
strategy with c.d.f. $F_i(u_i(t) \mid z(t))$. Clearly these concepts are
generalizations of the ideas of open-loop and closed-loop controls. Note
that the strategy type is determined by the rules of the game rather
than by conditions obtained during a partie of that game.
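
The difference between the two strategy types can be sketched with pure strategies in a small multistage game. Everything in the example (the dynamics `f`, the rule `toward_origin`, the control values) is a hypothetical illustration; mixed strategies would replace the deterministic choices with draws from the corresponding distribution functions.

```python
def open_loop(plan):
    """Strategy that ignores observations: the control depends on the stage only."""
    return lambda z, i: plan[i]


def closed_loop(rule):
    """Strategy that uses the currently observed state."""
    return lambda z, i: rule(z, i)


def play(f, z0, n_stages, strat_1, strat_2):
    """Run a fixed-duration partie and return the visited states."""
    z = z0
    states = [z]
    for i in range(n_stages):
        u1 = strat_1(z, i)
        u2 = strat_2(z, i)
        z = f(z, u1, u2, i)
        states.append(z)
    return states


def f(z, u1, u2, i):
    """Hypothetical scalar dynamics."""
    return z + u1 - u2


def toward_origin(z, i):
    """Hypothetical feedback rule: push the observed state toward zero."""
    return -1.0 if z > 0 else (1.0 if z < 0 else 0.0)


p1_open = open_loop([1.0, 1.0, 1.0])
p1_closed = closed_loop(toward_origin)
p2 = open_loop([0.0, 2.0, 0.0])

states_open = play(f, 0.0, 3, p1_open, p2)
states_closed = play(f, 0.0, 3, p1_closed, p2)
```

Against the same opponent the open-loop player follows his plan regardless of the state, while the closed-loop player reacts to the displacement caused by the opponent's second-stage control.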

Ultimately, each player in a game strives to achieve some
goal. For mathematical games this fact is represented by
associating with each player $j$ a payoff functional, which for each partie
assigns to that player a real number $J_j$ that depends upon the
structure of the game and the course of the partie. In particular, if
$u_j$, $j = 1, 2, \ldots, n$, denotes control histories and $z$ denotes state
histories, then we write

$$J_j = J_j(z, u_1, \ldots, u_n), \quad j = 1, 2, \ldots, n$$  (2.3)

to represent the payoffs. A game for which $\sum_{j=1}^{n} J_j = 0$ is called a
zero-sum game; any other game is nonzero-sum. Depending upon
the nature of the game, the payoffs may belong to finite or infinite
sets and may be bounded or unbounded.

Each player in a game chooses his control history during a
partie, and thus designs his strategy, to best serve his own interests.
The exact nature of "best" is dependent upon the rules of the game;
for example, a player may in some games submerge his direct
interests to those of a group and in other games may strive for
maximum security of payoff rather than for maximum payoff.
Furthermore, frequently a function of the payoff such as its mean is
extremized rather than the raw payoff. In any case, if it is possible
for each player to design a strategy which in the game sense best
serves his interests in terms of a function $\mathcal{E}_j$ of the payoffs, then
his payoff when all players use such optimal strategies is called
the value of the game to him. We write this as

$$W_j = \operatorname{val}\{\mathcal{E}_j(J_1, J_2, \ldots, J_n)\}, \quad j = 1, 2, \ldots, n$$  (2.4)

Because  the  exact  nature  of  the  maximization  is  so  intimately  related 
to  the  particular  structure  of  a  game,  it  is  generally  difficult  to  be 
more  definitive  than  this  except  for  one  particular  class  of  prob¬ 
lems,  the  class  of  two-person  zero-sum  games. 

Two-person  zero-sum  games  are  the  subject  of  intense 
research  interest  and  accordingly  are  the  source  of  considerable 
specialized  terminology.  In  such  games,  it  is  possible  to  define 
a  single  payoff  function  J  which  has  the  property  that 

$$J = J_1 = -J_2$$  (2.5)


Such  games  are  often  called  perfectly  competitive,  since  by  their 
nature  one  player's  gain  is  the  other's  loss.  In  these  games  a 
rational  player  attempts  to  maximize  his  minimum  possible  expected 
payoff; i.e., Player I attempts to maximize the minimum possible
$\mathcal{E}(J)$ and Player II tries to minimize the maximum of $\mathcal{E}(J)$. If we call
$\Phi_i$ the strategy sets for the players, $i = 1, 2$, then we write

$$\bar{J}_1 = \max_{\phi_1 \in \Phi_1} \min_{\phi_2 \in \Phi_2} \mathcal{E}(J)$$
$$\bar{J}_2 = \min_{\phi_2 \in \Phi_2} \max_{\phi_1 \in \Phi_1} \mathcal{E}(J)$$

as the goals of the two players. If $\bar{J}_1 = \bar{J}_2$, then this common number
is by definition the value of the game. It is clear that

$$W = \bar{J}_1 = \bar{J}_2 = \mathcal{E}(J)\big|_{\phi_1^o, \phi_2^o}$$

has the property

$$\mathcal{E}(J)\big|_{\phi_1, \phi_2^o} \le W \le \mathcal{E}(J)\big|_{\phi_1^o, \phi_2}$$  (2.6)

where the notation indicates that the payoff function is to be evaluated
using the optimal strategy $\phi_i^o \in \Phi_i$ and any admissible strategy
$\phi_j \in \Phi_j$, $j \ne i$. Condition (2.6) is called a saddlepoint condition, and a
strategy $\phi_i^o$ which yields this condition is called an optimal strategy,
a saddlepoint strategy, or a mini-max strategy. These notions are
also used in some other classes of games.
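
The max-min and min-max goals and the saddlepoint condition (2.6) can be checked numerically for a finite two-person zero-sum game. The sketch below assumes a payoff matrix `A` whose entry `A[i][j]` is the payoff to Player I (the maximizer); the matrices and function names are hypothetical. Because the expected payoff is linear in each player's mixture, testing deviations to pure strategies is sufficient.

```python
def pure_security_levels(A):
    """Return (max-min, min-max) of the payoff over pure strategies."""
    row_minima = [min(row) for row in A]
    col_maxima = [max(A[i][j] for i in range(len(A))) for j in range(len(A[0]))]
    return max(row_minima), min(col_maxima)


def expected_payoff(A, p, q):
    """Mean payoff when Player I mixes rows with p and Player II mixes columns with q."""
    return sum(p[i] * A[i][j] * q[j]
               for i in range(len(A)) for j in range(len(A[0])))


def unit(k, n):
    """Pure strategy k written as a degenerate mixed strategy of length n."""
    return [1.0 if i == k else 0.0 for i in range(n)]


def is_saddlepoint(A, p, q, tol=1e-12):
    """Check E(p', q) <= E(p, q) <= E(p, q') for all pure deviations p', q'."""
    w = expected_payoff(A, p, q)
    m, n = len(A), len(A[0])
    left = all(expected_payoff(A, unit(i, m), q) <= w + tol for i in range(m))
    right = all(expected_payoff(A, p, unit(j, n)) >= w - tol for j in range(n))
    return left and right


# Matching pennies: no saddlepoint in pure strategies, one in mixed strategies.
pennies = [[1.0, -1.0], [-1.0, 1.0]]
# A game whose max-min and min-max coincide in pure strategies.
dominated = [[3.0, 1.0], [4.0, 2.0]]
```

For `pennies` the pure security levels differ, so a value exists only when mixed strategies are admitted; the equal mixture for both players satisfies the saddlepoint test with value zero.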

If  at  least  one  player  in  a  game  lacks  some  essential  piece 
of  information,  such  as  exact  knowledge  of  the  state  vector,  the 
nature  of  the  dynamics,  or  the  payoff  for  some  player,  then  the  game 
is called a game of imperfect information, or a stochastic game;
otherwise,  the  game  is  one  of  perfect  information.  Common 
dynamic games of imperfect information are those for which at
least one player has knowledge of a vector function of the state,

$$y_i = h_i(z, w)$$  (2.7)

where $w$ is a random vector, rather than of the state vector $z$, or
where the dynamics functions depend on a random vector $\xi$ as well
as on the state and the controls. Many games do not fall naturally
into  either  category,  and  their  precise  classification  must  be  by 
convention.  We  shall  use  the  following  convention:  if  the  controls 
and  state  are  random  variables  due  solely  to  the  use  of  mixed 
strategies  and  the  participants  have  equivalent  knowledge  of  the 
game, then we shall call it a game of perfect information.

With  the  above  concepts  in  mind,  we  are  able  to  characterize 
a  great  variety  of  mathematical  games.  In  this  report  are  described 
the  optimal  mixed  strategies  for  two-person  zero-sum  multistage 
games  with  fixed  duration  and  perfect  information  and  with  payoff 
and  dynamics  functions  characterized  by  polynomials.  Both  open- 
loop  and  closed-loop  strategies  are  examined. 

2.2  THE  HISTORY  OF  GAME  THEORY 

A  great  amount  of  research  concerning  mathematical  game 
theory  has  been  published:  A  bibliography  compiled  in  1959  [2]  has 
more  than  one  thousand  entries,  and  a  recent  bibliography  of 
differential  games  [3]  contains  over  two  hundred  references  and  is 
still  incomplete.  Therefore,  any  overview  of  the  field  is  useful  but 
necessarily  cursory.  This  section  reviews  the  history  of  the  branch 
of  game  theory  which  is  most  closely  related  to  this  report. 

Although there are earlier relevant publications, it is
generally conceded that game theory had its birth with the publication
of the classic work of von Neumann and Morgenstern [1]. Besides
creating the field, these researchers contributed some standard
results, the most important being a theorem proving that for static
two-person  zero-sum  games  for  which  the  controls  must  be  chosen 
from  finite  sets,  optimal  strategies  and  a  value  would  exist  provided 
that  mixed  strategies  were  allowed.  This  was  later  proven  in 
alternative  ways,  among  them  the  dual  theory  of  linear  programming, 
and  it  was  shown  that  the  mixed  strategies  could  be  computed  using 
linear programming (see, for example, Gass [4]).
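
For reference, one standard linear programming form of the finite game is sketched here. The payoff matrix $A = [a_{ij}]$ (the payoff to the maximizing player when he uses his $i$-th control and his opponent uses his $j$-th), the probabilities $p_i$, and the scalar $v$ are notation assumed for this sketch rather than taken from [4].

```latex
\begin{aligned}
\text{maximize}   \quad & v \\
\text{subject to} \quad & \sum_{i=1}^{m} p_i \, a_{ij} \ge v, \qquad j = 1, \ldots, n, \\
                        & \sum_{i=1}^{m} p_i = 1, \qquad p_i \ge 0, \quad i = 1, \ldots, m .
\end{aligned}
```

The optimal $v$ is the value of the game and the optimal $p$ is a mixed strategy for the maximizing player; the dual program yields a mixed strategy for the minimizing player.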

Following  the  publication  of  that  book,  game  theory  was  the 
subject  of  intensive  research  interest  for  several  years.  Interest  in 
static games was particularly high, and among the results are
algorithms  for  solving  general  two-person  zero-sum  games  with 
finite  control  sets  and  theorems  showing  that  a  value  and  optimal 
mixed  strategies  exist  for  certain  two-person  zero-sum  games  with 
infinite  control  sets.  The  former  fact  was  alluded  to  in  the  preceding 
paragraph,  and  initial  versions  of  the  latter  are  attributed  by  Kuhn 
and Tucker [5], among others, to J. Ville and to A. Wald.
Blackwell and Girshick [6] supply a fairly comprehensive discussion of
the  mini-max  theorem. 

Along with these general results, many special two-person
zero-sum games were examined, including in particular the
so-called games over the unit square, in which the players choose
controls as real numbers from the unit interval [0, 1] and the payoff
functions  are  of  special  forms,  such  as  polynomials  or  convex 
functions.  An  excellent  source  for  this  period,  with  interesting  and 
enlightening  commentary  by  the  editors,  is  the  series  Contributions 
to  the  Theory  of  Games  [5],  [7],  [8],  [9]. 

A new dimension was added to game problems in the middle
1950's by Isaacs when he created dynamic games, particularly two-
person zero-sum differential games [10], [11], [12], [13]. His
highly original work is available as a book [14] which is best read
along with a book review by Ho [15]. In brief, Isaacs is concerned
with  examples  of  problems  with  dynamics 

$$\dot{z} = f(z, u, v; t), \quad z(0) = z_0$$

and payoff function

$$J(z_0, u(t), v(t)) = g_1(z_c, t_c) + \int_0^{t_c} g(z(t), u(t), v(t), t) \, dt$$  (2.8)

where $t_c$ is the time at which a given terminal manifold is reached
and $z_c$ is the final position on that manifold. He assumes that the
payoff has a saddlepoint when pure strategies are used and argues
that if the value function $J^*(z, t)$ exists, it satisfies his Main Equation
One, or ME$_1$,

$$\frac{\partial J^*}{\partial t} + \min_{u} \max_{v} \left[ (\nabla_z J^*)^T f(z, u, v, t) + g(z, u, v, t) \right] = 0$$  (2.9)

where $\nabla_z$ is the gradient operator. To find this, he applies what he
calls the Tenet of Transition, a game theory analog of Bellman's
Principle of Optimality which he apparently found independently. In
principle, (2.9) may be solved for $u^o = u^o(z, \nabla_z J^*, t)$ and
$v^o = v^o(z, \nabla_z J^*, t)$, which are then inserted to give the Main
Equation Two, or ME$_2$,

$$\frac{\partial J^*}{\partial t} + (\nabla_z J^*)^T f(z, u^o(z, \nabla_z J^*, t), v^o(z, \nabla_z J^*, t), t) + g(z, u^o(z, \nabla_z J^*, t), v^o(z, \nabla_z J^*, t), t) = 0$$  (2.10)


This equation is of Hamilton-Jacobi type, and is commonly referred
to as such. Equation (2.9) is often called a Hamilton-Jacobi-Bellman
equation or a pre-Hamiltonian equation.

Using his main equations, Isaacs also contributes a sufficiency
theorem. In essence, he finds that if $J^*(z, t)$ is a unique continuous
function satisfying the ME's and the boundary condition $J^*(z_c, t_c) =
g_1(z_c, t_c)$, then $J^*$ is the value $w(z, t)$ of the game and any pure
strategies which furnish the min-max in (2.9) and cause the desired
end point to be reached are optimal. This is true in a limiting sense,
that is, as the limit of a convergent series of discrete
approximations to the differential game.

Interest  in  differential  games  built  up  gradually  for  several 
years and culminated in a major work by Berkovitz [16], who
extended results of the classical calculus of variations to zero-sum
two-person differential games. His principal results are that under
some fairly restrictive conditions, the Hamiltonian-like function

$$H(z, u, v, p) = p^T f(z, u, v) + g(z, u, v)$$  (2.11)

satisfies, for optimal controls $u^o$ and $v^o$,

$$\dot{z} = \nabla_p H(z, u^o, v^o, p)$$
$$\dot{p} = -\nabla_z H(z, u^o, v^o, p)$$  (2.12)


4 .  =  0 

^0  J^K.  =  0 


=  0 


U*0  m.k.  =0 


where  K  andTC  are  vector  constraint  functions  on  u  and  j/ , 
respectively,  and and  £  are  associated  multipliers.  He  also 
establishes a form of the Hamilton-Jacobi equation (2.10) and sufficiency
conditions  using  field  concepts.  The  results  apply  to  problems  which 
have  solutions  in  pure  strategies. 

Once  these  basic  results  were  established,  a  great  many 
researchers  applied  them  to  special  cases  and  interpretations,  and 
to  extensions  of  the  same  class  of  problems.  Among  these,  a  very 
influential work was contributed by Ho, Bryson, and Baron [17], who
studied a particular game with linear dynamics and quadratic payoff
which has pure strategy solutions. Other contributions in the same
general  area  of  two-person  zero-sum  differential  games  include 
those of Wong [18], Meier [19], Meschler [20], and Wu and Li [21].
Interesting  geometric  work  in  an  augmented  state  space  is  found  in 
works  by  Leitmann  and  others.  Blaquiere,  Gerard,  and  Leitmann 
[22] is representative of this approach.

A  variation  of  the  above  differential  game  has  received 
attention  from  several  researchers,  including  some  of  the  prominent 
Russians. If the payoff for a two-person zero-sum game is the time
$T$ of attaining a terminal manifold, a problem is created which may
not end; i.e., it may be that $T = \infty$. Pontryagin [23] shows that if an
optimal payoff exists, his maximum principle may be applied to such
Other results for related problems appear in such works as
Chattopadhyay [24] and Varaiya [25].

Research  interest  is  now  shifting  to  games  other  than  two- 
person  zero-sum  differential  games  of  perfect  information  which 
have  pure  strategy  optimal  solutions.  In  particular,  dynamic  games 
with  n  players,  with  imperfect  information,  or  with  mixed  strategy 
solutions  are  being  investigated.  These  areas  overlap,  of  course, 
but  it  is  enlightening  to  consider  them  separately.  The  third  area 
is  surveyed  in  the  next  section. 

The  fundamental  philosophical  problem  of  n-person  games 
and the closely related nonzero-sum games is the definition of what
is  meant  by  a  solution.  There  are  at  least  three  basic  solution  types: 
min-max  for  each  player,  equilibrium  solutions  in  which  no  player  can 
improve  his  payoff  unilaterally,  and  bargaining  solutions  in  which 
no  player  can  change  his  strategy  without  adversely  affecting  at  least 
one  other  player.  Therefore,  the  rules  of  the  game,  and  particularly 
questions  of  agreements  and  side  payments  among  players,  dominate 
the theory. References [1], [7], [8], and [9] contain some of the
relevant publications for static games. Case [26] and Starr and Ho
[27], [28], who also have published similar works elsewhere, are
leaders in studies of the n-person differential game problem. In
particular, they have found that when equilibrium solutions are sought,
individual Hamilton-Jacobi equations apply for each player along the
optimum  state  trajectory  and  that  a  method  of  characteristics  is 
sometimes  applicable.  Min-max  solutions  may  be  found  for  each 
player  by  applying  two-person  game  theory,  and  bargaining  solutions 
are  related  to  optimal  control  problems  with  vector  payoff  functions. 

Studies  of  games  with  imperfect  information  have  generally 
been  concentrated  on  two-person  zero-sum  dynamic  games  with  noisy 
state transition and noisy observations of the state by the players.
The fundamental problem is that a player must base his controls on
his available information, which tends to be incomplete and inexact,
and  must  guess  not  only  the  state,  but  what  his  opponent  thinks  the 
state  is,  what  his  opponent  thinks  he  thinks  the  state  is,  ad  infinitum. 
The  payoff  is  usually  taken  as  the  mean  of  the  given  payoff  function. 

Behn and Ho [29] circumvent some of the computational problems
by assuming a control form and then determining its parameters
based upon the statistics of the noise processes. Rhodes and Luenberger
[30], [31] show that a type of stochastic Hamilton-Jacobi-Bellman
approach is applicable when the contenders are able to
determine their opponent's strategy, and it is noteworthy that their
results do not require pure strategies. An interesting approach is
suggested by Sugino [32], who postulates bounded noise and thus is
able to find mini-max strategies by using regions of attainability.
Other important research includes that of Kushner and Chamberlain,
who in several works, among them [33], study the Markov process
characteristics of stochastic games, and Bley and Stear [34], who
use a Bayesian analysis of multistage games to find conditions
for pure strategies.

In closing this section, we remark that there is much to be done
even in the fields so far considered. It is noteworthy that much of the
work on dynamic games since Isaacs has been so highly control system
oriented that it has led to what has only recently been recognized as
a distortion of the approach and a lack of recognition of some of the
peculiar, fascinating properties which mathematical games possess.
This fact has been noted by Isaacs [35] and Ho [36], for instance.

2.3 THE SYNTHESIS OF RANDOMIZED STRATEGIES FOR DYNAMIC GAMES

Early  researchers  actively  sought  mixed  strategy  solutions  to 
their static problems. We have already noted that linear programming
yields mixed strategies for two-person zero-sum static games
with finite control sets. Other games, such as games over the unit
square, that is, games for which the controls are scalars chosen
from the unit interval [0, 1], were examined, and solutions were
discussed for two-player zero-sum games for which the payoffs
are convex functions (Bohnenblust, Karlin, and Shapley [37]), polynomial
functions (Dresher, Karlin, and Shapley [38]), and bell-shaped
functions (Karlin [39]), among others. Many of the results from this
era  may  be  found  in  Karlin's  book  [40]. 

The research of Karlin et al. [38], [40], and [41] on
polynomial and separable games is particularly relevant to our problem.
They, however, are concerned solely with static games with
scalar controls. They show that for games with separable payoff
functions the problem of finding optimal mixed strategies can be
reduced to finding moments of those strategies. The latter problem
is then examined for games of known value using the method of dual
convex  cones.  Their  concern  is  with  characterizing  the  relevant 
sets,  and  they  consider  neither  synthesis  of  solutions  using  the  dual 
cones,  problems  with  vector  controls,  nor  the  effects  of  introducing 
dynamics  to  the  game. 

Few  other  researchers  have  considered  extending  the  theory 
of static games to dynamic games. Bley [42] suggests the application
of the theory of convex games and works a scalar multistage
example in his study of linear-quadratic games. Cliff [43], who is
generally discouraging about the utility of mixed strategies in
realistic dynamic games, suggests analyzing the pre-Hamiltonian
using static game theory and examines a simple differential game
example using the theory of bell-shaped games. Rhodes [44] employs
arguments related to the theories of convex and polynomial static
games in examples of linear-quadratic dynamic games. None of
these  researchers  is  primarily  concerned  with  synthesizing  mixed 
strategies,  and  their  efforts  in  this  regard  are  confined  to  examples. 

Techniques  other  than  extensions  of  static  game  theory  have 
been  suggested.  In  a  series  of  publications  Berkovitz  and  Dresher 
[45],  [46],  [47]  evaluate  tactical  air-war  problems  which  have 
linear  payoff  and  multistage  limited-linear  dynamics.  Their 
solutions  are  obtained  by  ad  hoc  methods  which  do  not  appear  to  be 
of  general  interest. 

An  interesting  approach  suggested  by  Ho  [48]  and  extended 
by Speyer [49] is to force the controls to be random variables by
introducing a dependence of the controls on random vectors. Speyer
does this by choosing controls of the form

$$ u = K\hat{z} + \xi \tag{2.14} $$

where $\hat{z}$ is a state variable estimator and $\xi$ is a white noise vector
with zero mean and controllable covariance Q. His problem, a
particular linear-quadratic game, is such that only the statistics
of the random variables, rather than their instantaneous values, are
of importance, and the problem becomes one of finding the gain K
and covariance matrix Q. Thus the problem is considerably different
in means, if not ends, from that of synthesizing the randomness
by generating probability distributions for the controls.

In  an  interesting  and  provocative  paper,  Chattopadhyay  [50] 
points  out  that  since  in  the  game  surface  approach  the  normals  to  the 
surface are intimately related to the optimal strategies, finite
mixed strategies might be related to "mixed normals." Thus one
can  in  principle  seek  an  optimal  normal  and  then  relate  it  to  pure 
normals  and  to  mixtures  of  pure  strategies.  As  with  much  of  the 
game  surface  technique,  this  appears  to  be  more  useful  for  supplying 
insight  than  for  construction  of  solutions. 

Another  suggestion  is  made  by  Sarma,  Ragade,  and  Mandke 
[51]. Arguing purely formally, they state that the value must
satisfy  a  stochastic  Hamilton-Jacobi-Bellman  partial  differential 
equation  with  simultaneous  extrema  in  the  probability  density 
functions  of  the  mixed  strategies  of  the  two  contestants  in  a  zero-sum 
differential  game.  Existence  or  uniqueness  of  solutions  is  neither 
proved nor claimed. Since the concept of probability densities does
not appear to be useful (because they cannot represent pure strategy
regions as degenerate cumulative probability distributions can), it
is likely that the particular result of Sarma et al. will have limited
application. 

Smoliakov [52] formulates the problem slightly differently to
find mixed strategies for a two-person zero-sum differential game.
By requiring that the dynamics equation hold in a mean sense

$$ E\{\dot{z} - f(z, u, v, t)\} = 0 \tag{2.15} $$

rather than in the absolute sense, he is able to put the problem of
mini-maxing the mean of the payoff over the mixed strategies into a
form which can be attacked by variational methods. The physical
significance of (2.15) is debatable, however.

Little other work concerning actual synthesis of mixed strategies
has been performed. Some researchers have been unconcerned
with synthesis and neither found nor ruled out mixed strategies. The
publications of Rhodes and Luenberger [30], [31] and Rhodes [44]
are examples of this.

We  have  already  mentioned  that  much  of  Chapter  4  represents 
extensions  of  the  work  of  Karlin  and  others.  Another  portion  of  the 
foundation  of  our  research  is  the  fact  that  a  saddlepoint  solution 
indeed  exists  for  the  static  and  the  open-loop  problems  formulated, 
for proof of which Blackwell and Girshick [6] is one of many possible
references.  For  the  closed-loop  dynamic  problem,  the  dynamic 
programming  approach  is  used.  This  has  been  used  by  a  number  of 
authors;  its  validity  for  the  problems  of  concern  here  has  been  stated 
as  a  theorem,  for  example,  by  Fleming  [53]. 


CHAPTER  3 


PROBLEM  STATEMENT 

This research was motivated by the desire to synthesize solutions
for a particular class of mathematical games, although many of
the results have a more general domain of applicability than this. The
goal may be stated as follows: we seek to find the value and the
cumulative probability distributions representing the optimal mixed
strategies, both open-loop and closed-loop, for the class of fixed-duration
two-person zero-sum multistage games characterized by
polynomial dynamics and payoff functions and by noise-free information.
This statement is clarified and the importance of such
problems  is  discussed  in  the  following  sections. 

3.1 SYSTEM SCENARIO

The systems of interest to us are dynamic systems which proceed
in a step-wise manner under the influence of simultaneous inputs
from two controllers. Thus we are concerned with sequences of real
$\ell$-vectors {z(i)}, m-vectors {u(i)}, and n-vectors {v(i)} (where i is
an indexing variable which traverses the integers) which are
interrelated according to the dynamics equation

$$ z(i+1) = f(z(i), u(i), v(i);\ i) \tag{3.1} $$

The functions f are presumed known to the players and by assumption
are polynomial functions of their arguments z(i), u(i), and v(i) and
are indexed by the stage index i. The vectors have the following
additional  properties  for  each  i: 

z(i) - Belongs to Euclidean $\ell$-space $E^{\ell}$. Called the
state or state vector of the system.

u(i) - Chosen from a unit hypercube U in $E^m$,

$$ U = \{u \mid u_i \in [0, 1],\ i = 1, 2, \ldots, m\}, \tag{3.2} $$

by a rational controller called Player I or the
maximizer.

v(i) - Chosen from a unit hypercube V in $E^n$,

$$ V = \{v \mid v_i \in [0, 1],\ i = 1, 2, \ldots, n\}, $$

by a rational controller called Player II or the
minimizer.

A  game  may  be  described  for  this  system  by  introducing 
rules  and  a  payoff  function.  We  are  concerned  with  games  such  that 
a  particular  play,  or  partie,  proceeds  from  a  given  initial  state  z, 
which is identified with stage 1, i.e., z(1) = z, for a fixed number
N of stages. Two variations on the basic rules are of interest.

In  the  first  game,  called  the  game  of  closed-loop  strategies, 
each controller, cognizant of the state z(i), of the history of play
(i.e., of z(1), z(2), ..., z(i-1), u(1), u(2), ..., u(i-1), v(1), v(2), ...,
v(i-1)), of the dynamics f and the payoff function J, and of the number
N-i of remaining stages, but ignorant of the other controller's future
control vector choices, chooses a control vector from his set of
admissible controls U (or V). This happens for each i, i = 1, 2, ...,
N; each participant fully expects it to do so and hence designs
closed-loop strategies.


In  the  second  game,  called  the  game  of  open-loop  strategies, 
the controllers cannot be certain of ever receiving updated data.
For this reason they design open-loop controls to use for the remainder
of the game, and recompute these if any new data become
available.  Data  are  assumed  available  to  both  players  or  to  neither; 
they  have  equivalent  knowledge  of  the  state. 

For  either  of  these  variations,  at  the  end  of  the  partie  a 
scalar  amount  J  determined  by 

$$
\begin{aligned}
J &= J(z;\ u(1), u(2), \ldots, u(N),\ v(1), v(2), \ldots, v(N)) \\
  &= g_{N+1}(z(N+1)) + \sum_{i=1}^{N} g_i(z(i), u(i), v(i))
\end{aligned}
\tag{3.5}
$$

is paid by Player II to Player I. The functions $g_i$, i = 1, 2, ..., N+1,
are assumed to be polynomial functions of their arguments.
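
The bookkeeping of a single partie can be illustrated with a short Python sketch. It steps the dynamics (3.1) forward from the initial state and accumulates the stage and terminal payoffs as in the payoff equation above. The function name `play_partie` and the scalar example functions (`f_example`, `g_stage_example`, `g_terminal_example`) are hypothetical choices made only for illustration; they are simple polynomials, but they are not taken from the text.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def play_partie(
    z1: Sequence[float],
    u_seq: Sequence[Sequence[float]],
    v_seq: Sequence[Sequence[float]],
    f: Callable[[Vector, Sequence[float], Sequence[float], int], Vector],
    g_stage: Callable[[Vector, Sequence[float], Sequence[float], int], float],
    g_terminal: Callable[[Vector], float],
) -> Tuple[float, List[Vector]]:
    """Return (J, trajectory) for one play with given control sequences."""
    if len(u_seq) != len(v_seq):
        raise ValueError("both players must supply one control per stage")
    for control in list(u_seq) + list(v_seq):
        if any(c < 0.0 or c > 1.0 for c in control):
            raise ValueError("controls must lie in the unit hypercube")
    z = list(z1)
    trajectory = [list(z)]
    payoff = 0.0
    # Stages are numbered 1..N, with z(1) equal to the given initial state.
    for i, (u, v) in enumerate(zip(u_seq, v_seq), start=1):
        payoff += g_stage(z, u, v, i)   # stage payoff g_i(z(i), u(i), v(i))
        z = f(z, u, v, i)               # dynamics: z(i+1) = f(z(i), u(i), v(i); i)
        trajectory.append(list(z))
    payoff += g_terminal(z)             # terminal payoff g_{N+1}(z(N+1))
    return payoff, trajectory


# Hypothetical scalar example (assumed for illustration only).
def f_example(z, u, v, i):
    return [z[0] + u[0] - v[0]]


def g_stage_example(z, u, v, i):
    return u[0] * v[0]


def g_terminal_example(z):
    return z[0] ** 2


payoff, trajectory = play_partie(
    [0.5],
    [[1.0], [0.0]],   # Player I controls u(1), u(2)
    [[0.5], [0.5]],   # Player II controls v(1), v(2)
    f_example,
    g_stage_example,
    g_terminal_example,
)
print(payoff, trajectory)
```

With mixed strategies, the control sequences passed in would be realizations drawn from the players' distributions, and the quantity of interest would be the mean of the returned payoff.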

By  describing  the  dynamics,  rules,  and  payoff  function,  we 
have  defined  a  game.  The  concepts  of  solutions  to  this  game  are 
pursued  in  the  next  section,  and  the  particulars  of  solutions  are 
treated  in  Chapters  4  and  5. 

3.2 THE CONCEPT OF SOLUTION: VALUE FUNCTIONS AND STRATEGIES

The two players in the game of Section 3.1 are presumed to be
both intelligent and rational in that each will attempt to optimize the
payoff J according to his own best interests. To ensure his success,
each player employs a strategy, which we may think of as a rule or
mapping which implies an admissible control vector for each contingency
in the game, that is, for each possible position z and stage i.
If a unique control vector is implied by this function for each contingency,
then the function is called a pure strategy. If the mapping
also depends on a random variable, so that the selected control
depends upon the realized value of this random variable in addition to
z and i, then the function is called a randomized or mixed strategy.
It is clear that a pure strategy is a special case of a mixed strategy.

Since finding good strategies for the competitors is fundamental
to solving games, we must refine the notion of mixed strategies.
The key concept is that at each stage each player chooses his
control  vector  in  a  (possibly)  random  manner.  The  exact  means  of 
introducing  the  randomness  is  incidental;  the  crucial  factor  is  the 
relative  frequency  of  utilization  of  the  elements  of  the  admissible 
control  set.  In  other  words,  the  important  aspect  of  mixed  strategies 
is  that  they  are  related  to  probability  measures  defined  over  the  set 
of  admissible  controls.  Thus  part  of  our  objective  is  to  find  for  each 
player  a  best  mixed  strategy,  where  by  mixed  strategy  is  meant  a 
cumulative distribution function, or c.d.f., defined over the set of
admissible  controls  and  parameterized  as  necessary  by  the  state  z 
and  stage  index  i. 
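
As a concrete picture of this definition, the following Python sketch represents a mixed strategy with finite support as a list of atoms in the unit hypercube together with their probabilities. The class name `FiniteMixedStrategy` is a hypothetical illustration, and restricting attention to finite support is an assumption made here for simplicity; the text allows arbitrary c.d.f.'s. A pure strategy appears as the degenerate case of a single atom.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


class FiniteMixedStrategy:
    """Probability measure on the unit hypercube with finitely many atoms."""

    def __init__(self, atoms: Sequence[Tuple[float, ...]], weights: Sequence[float]):
        if len(atoms) != len(weights) or not atoms:
            raise ValueError("need one weight per atom")
        if any(w < 0.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must be non-negative and sum to one")
        if any(c < 0.0 or c > 1.0 for atom in atoms for c in atom):
            raise ValueError("atoms must lie in the unit hypercube")
        self.atoms: List[Tuple[float, ...]] = list(atoms)
        self.weights: List[float] = list(weights)

    def sample(self, rng: random.Random) -> Tuple[float, ...]:
        """Draw one control vector according to the strategy."""
        return rng.choices(self.atoms, weights=self.weights, k=1)[0]


# A mixed strategy on [0, 1] and a pure strategy as its degenerate special case.
mixed = FiniteMixedStrategy([(0.0,), (1.0,)], [0.25, 0.75])
pure = FiniteMixedStrategy([(0.5,)], [1.0])

rng = random.Random(0)
n_draws = 20000
counts = Counter(mixed.sample(rng) for _ in range(n_draws))
freq_one = counts[(1.0,)] / n_draws   # relative frequency of the atom u = 1
print(freq_one, pure.sample(rng))
```

The means by which the randomness is generated is incidental, as the text notes; only the relative frequencies matter. A strategy parameterized by the state z and stage i could be modeled as a function returning such an object for each (z, i).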

Since  randomness  was  introduced  via  mixed  strategies, 
the  payoff  function  is  a  random  variable  and  the  state  is  a  Markov 
sequence. Hence, it is reasonable that the contenders should wish to
optimize a statistical function of the payoff J, in our case the mean.
Therefore, in the games considered here, Player I is to use a
strategy such that the minimum achievable mathematical expectation
of J is maximized, and Player II will adopt a strategy which
minimizes the maximum achievable expectation of J. The mean of J
for a given initial condition z when both players use their best
strategies is known (see, e.g., Blackwell and Girshick [6] and
Fleming [53]) to satisfy the saddlepoint condition (2.6) for games of
the type considered here and therefore is called the value w(z) of the
game. 

Let  us  make  the  above  paragraphs  more  precise  for  the  two 
variations  of  our  basic  game.  To  do  this,  we  first  introduce  the 
notion  of  the  truncated  game  i,  which  is  the  game  which  starts  at 
stage  i  and  position  z(i)  and  continues  for  N-i  stages.  The  payoff  for 
this  game  is 

$$
\begin{aligned}
J_i &= J_i(z;\ u(i), u(i+1), \ldots, u(N),\ v(i), v(i+1), \ldots, v(N)) \\
    &= g_{N+1}(z(N+1)) + \sum_{k=i}^{N} g_k(z(k), u(k), v(k))
\end{aligned}
\tag{3.4}
$$

For the game of closed-loop strategies, we seek optimal
cumulative distribution functions (c.d.f.'s) $F^\circ(u(i) \mid z(i), i)$ and
$G^\circ(v(i) \mid z(i), i)$ defined for the maximizer on U and for the minimizer
on V, respectively, such that for each j = 1, 2, ..., N, and for each
i = j, j+1, ..., N, the value of the truncated game j is given by

$$
\begin{aligned}
w_j(z(j)) &= \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF^\circ(u(N) \mid z(N), N)\, dG^\circ(v(N) \mid z(N), N) \cdots dF^\circ(u(j) \mid z(j), j)\, dG^\circ(v(j) \mid z(j), j) \\
&= \min_{\substack{G_i \in \Gamma_i \\ i = j, \ldots, N}} \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF^\circ(u(N) \mid z(N), N)\, dG_N(v(N) \mid z(N), N) \cdots dF^\circ(u(j) \mid z(j), j)\, dG_j(v(j) \mid z(j), j) \\
&= \max_{\substack{F_i \in \Phi_i \\ i = j, \ldots, N}} \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF_N(u(N) \mid z(N), N)\, dG^\circ(v(N) \mid z(N), N) \cdots dF_j(u(j) \mid z(j), j)\, dG^\circ(v(j) \mid z(j), j)
\end{aligned}
\tag{3.3}
$$

Here $\Gamma_i$ and $\Phi_i$ are the sets of all admissible conditional c.d.f.'s
defined on V and U, respectively. That such a $w_j(z)$ indeed exists
is known from Fleming [53]; this function is discussed further in
Chapter 5 when dynamic programming is considered.

For the game of open-loop strategies, the players of the
truncated game j must develop their strategies under the assumption
that z(i), i > j, may never be known to them. Hence, in this case the
c.d.f.'s can be conditioned only on z(j) and on the player's own
controls. We therefore need c.d.f.'s $F^\circ(u(i) \mid z(j), j;\ u(j), \ldots, u(i-1))$
and $G^\circ(v(i) \mid z(j), j;\ v(j), \ldots, v(i-1))$ for which the
value function $w_j$ satisfies

$$
\begin{aligned}
w_j(z(j)) &= \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF^\circ(u(N) \mid z(j), j;\ u(j), \ldots, u(N-1))\, dG^\circ(v(N) \mid z(j), j;\ v(j), \ldots, v(N-1)) \cdots \\
&\qquad \cdots dF^\circ(u(j) \mid z(j), j)\, dG^\circ(v(j) \mid z(j), j) \\
&= \min_{\substack{G_i \in \Gamma_i \\ i = j, \ldots, N}} \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF^\circ(u(N) \mid z(j), j;\ u(j), \ldots, u(N-1))\, dG_N(v(N) \mid z(j), j;\ v(j), \ldots, v(N-1)) \cdots \\
&\qquad \cdots dF^\circ(u(j) \mid z(j), j)\, dG_j(v(j) \mid z(j), j) \\
&= \max_{\substack{F_i \in \Phi_i \\ i = j, \ldots, N}} \int_V \int_U \cdots \int_V \int_U J_j(z(j);\ u(j), \ldots, u(N),\ v(j), \ldots, v(N)) \\
&\qquad dF_N(u(N) \mid z(j), j;\ u(j), \ldots, u(N-1))\, dG^\circ(v(N) \mid z(j), j;\ v(j), \ldots, v(N-1)) \cdots \\
&\qquad \cdots dF_j(u(j) \mid z(j), j)\, dG^\circ(v(j) \mid z(j), j)
\end{aligned}
\tag{3.6}
$$

Again $\Phi_i$ and $\Gamma_i$ are the sets of admissible conditional c.d.f.'s, but
they are not identical to those of the closed-loop strategy case,
which has a different structure. It is demonstrated in Chapter 5 that
this case reduces to a parameterized static case, so that standard
minimax theorems are applicable (e.g., see Blackwell and
Girshick [6]).

3.3 THE IMPORTANCE OF POLYNOMIAL N-STAGE GAMES

The class of games considered here is a special one; removal
of the zero-sum assumption or introduction of stochastic observations
or dynamics would create extremely difficult problems both
of concept and of computation. Nevertheless, our games are not
trivial. Two-person zero-sum games are good models of parlor
games and satisfactory approximations of many other situations.
Multistage dynamics are suitable for describing the manner in which
many real situations effectively evolve. That the control vectors
must be finite is eminently reasonable.

The polynomial approximation must be justified more subjectively.
Polynomials are widely used in engineering work as the
next step beyond simple linear models, for many functions of interest
can be approximated arbitrarily well by polynomials. Particularly
when  elaborate,  aesthetically  satisfying  models  prove  insoluble,  the 
solutions  to  polynomial  models  may  be  important  for  themselves 
and  for  the  insight  which  they  provide.  It  can  be  expected  that 
solutions  of  polynomial  games  will  have  similar  utility. 


CHAPTER 4


THE  SOLUTION  OF  SEPARABLE  STATIC  GAMES 

In this chapter we consider the solution of games for which
Player I selects a point $u \in U \subset E^m$, Player II simultaneously selects
$v \in V \subset E^n$, and then Player II pays to Player I an amount defined by
a function of the form

$$ J(u, v) = \sum_{i=0}^{\mu} \sum_{j=0}^{\nu} a_{ij}\, r_i(u)\, s_j(v) \tag{4.1} $$

By making the coefficients $a_{ij}$ functions of a state vector z, we will
in Chapter 5 relate this problem to the multistage game problem.

We remark that the game with payoff (4.1) is known to have
a value and optimum strategies provided that J(u, v) is continuous,
U and V are closed and bounded, and mixed strategies defined on an
infinite number of points are allowed. (See, for example, Blackwell
and Girshick [6], Chapter 2.) The results of this chapter will have
the effect of proving this independently since they essentially demonstrate
the value and strategies for the class of games considered.

4.1 SEPARABLE PAYOFF FUNCTIONS AND THE MOMENT PROBLEM

Single-stage games with payoff functions defined by polynomials,

$$ J(u, v) = \sum_{i=0}^{\mu} \sum_{j=0}^{\nu} a_{ij}\, u^i v^j \tag{4.2} $$

where u and v are scalars, are among the simplest examples of a
general class of games with separable payoffs, i.e., payoffs of the
form

$$ J(u, v) = \sum_{i=0}^{\mu} \sum_{j=0}^{\nu} a_{ij}\, r_i(u)\, s_j(v) \tag{4.1} $$

where $r_i(u)$ and $s_j(v)$ are continuous functions, and where $u \in U$,
$v \in V$, for U and V defined as unit hypercubes of dimension m and n,
respectively:

$$
\begin{aligned}
U &= \{u \mid u_i \in [0, 1],\ i = 1, 2, \ldots, m;\ u \in E^m\} \\
V &= \{v \mid v_i \in [0, 1],\ i = 1, 2, \ldots, n;\ v \in E^n\}
\end{aligned}
$$

For general polynomial payoffs, in which our ultimate interest lies,
the functions $r_i(u)$ have the form

$$ r_i(u) = u_1^{k_{i1}}\, u_2^{k_{i2}} \cdots u_m^{k_{im}} \tag{4.4} $$

where the exponents $k_{ij}$ are non-negative integers; the $s_j(v)$ have
analogous forms. The importance of separable payoffs is, as we
shall  develop  below,  the  fact  that  the  problem  of  determining  optimal 
mixed  strategies  may  be  reduced  to  a  problem  of  finding  optimal 
vectors  in  certain  convex  sets. 

To find solutions to the game with payoff (4.1), we will
search among the classes of mixed strategies for the contestants,
keeping in mind that pure strategies are special cases of mixed
strategies. Thus let admissible strategies for Player I, the maximizer,
consist of all cumulative distribution functions (c.d.f.'s)
defined over the set U. This might also be pictured as the class of
joint distribution functions for the variables $u_1, u_2, \ldots, u_m$. Let
F(u) denote an admissible c.d.f. Similarly, let admissible strategies
for Player II, the minimizer, consist of all c.d.f.'s defined on V


and a matrix $A = [a_{ij}]$, i = 0, 1, ..., $\mu$, j = 0, 1, ..., $\nu$, so that (4.8)
becomes

$$ J(F, G) = r^T(F)\, A\, s(G) \tag{4.9} $$


It is often convenient to remove the explicit dependence on the
c.d.f.'s F(u) and G(v) by rewriting (4.9) as

$$ J(r, s) = r^T A\, s \tag{4.10} $$
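
The bilinear form can be checked numerically for strategies with finite support. The Python sketch below assumes scalar controls with monomial functions $r_i(u) = u^i$ and $s_j(v) = v^j$, and takes r(F) and s(G) to be the vectors of expected values of those functions under F and G. The coefficient matrix, the two example strategies, and all function names are hypothetical choices made for illustration only.

```python
from itertools import product
from typing import List, Sequence


def moments(atoms: Sequence[float], weights: Sequence[float], degree: int) -> List[float]:
    """Moments E[t**k], k = 0..degree, of a finitely supported distribution."""
    return [sum(w * t ** k for t, w in zip(atoms, weights)) for k in range(degree + 1)]


def bilinear(r: Sequence[float], A: Sequence[Sequence[float]], s: Sequence[float]) -> float:
    """Evaluate r^T A s."""
    return sum(r[i] * A[i][j] * s[j] for i in range(len(r)) for j in range(len(s)))


def expected_payoff(A, u_atoms, u_weights, v_atoms, v_weights) -> float:
    """Average J(u, v) = sum_ij a_ij u^i v^j directly over independent draws."""
    total = 0.0
    for (u, pu), (v, pv) in product(zip(u_atoms, u_weights), zip(v_atoms, v_weights)):
        j_uv = sum(
            A[i][j] * u ** i * v ** j
            for i in range(len(A))
            for j in range(len(A[0]))
        )
        total += pu * pv * j_uv
    return total


# Hypothetical payoff J(u, v) = (u - v)^2 = u^2 - 2uv + v^2, so a_20 = a_02 = 1, a_11 = -2.
A = [
    [0.0, 0.0, 1.0],
    [0.0, -2.0, 0.0],
    [1.0, 0.0, 0.0],
]

u_atoms, u_weights = [0.0, 1.0], [0.5, 0.5]      # an example strategy F
v_atoms, v_weights = [0.25, 0.75], [0.5, 0.5]    # an example strategy G

r = moments(u_atoms, u_weights, degree=2)
s = moments(v_atoms, v_weights, degree=2)
via_moments = bilinear(r, A, s)
direct = expected_payoff(A, u_atoms, u_weights, v_atoms, v_weights)
print(via_moments, direct)
```

Both routes give the same mean payoff, which is the sense in which only the generalized moments of a strategy matter to the game.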


Let R denote the set of all vectors r(F) obtained as F ranges
over all admissible cumulative distribution functions on U, and let
S similarly denote the set of all s(G). Since r(F) and s(G) are
moments of their respective c.d.f.'s when the functions $r_i(u)$ and
$s_j(v)$ are terms of polynomials, for the more general separable
games it is useful to think of the functions as generalized moments
and we shall often refer to them as such. By extension, R and S
are called the generalized moment sets for Players I and II,
respectively. 

The importance of these transformations is that choosing a
c.d.f. turns out to be equivalent to choosing generalized moments
for a competitor. Thus our eventual problem, finding $F^\circ(u)$ and
$G^\circ(v)$ such that

$$ J(F, G^\circ) \le J(F^\circ, G^\circ) \le J(F^\circ, G) \tag{4.11} $$

where F and G are arbitrary admissible c.d.f.'s, is equivalent
to finding $r^\circ$ and $s^\circ$ such that

$$ J(r, s^\circ) \le J(r^\circ, s^\circ) \le J(r^\circ, s) \tag{4.12} $$

for all $r \in R$ and $s \in S$, and then finding distributions corresponding
to the optimal $r^\circ$ and $s^\circ$, provided, of course, that the saddlepoints
(4.11) and (4.12) exist. This transformation of the problem is a key
step on the path to solution of our separable games even though it is
little more than a change of variable.
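
The saddlepoint condition in moment form can be spot-checked on a small example. The Python sketch below uses the hypothetical payoff $J(u, v) = (u - v)^2$ on the unit square, for which the maximizer's mixture of u = 0 and u = 1 with equal weight and the minimizer's pure strategy v = 1/2 form a saddlepoint with value 1/4. Because J(r, s) is linear in r for fixed s and linear in s for fixed r, its extrema over the moment sets are attained at images of pure strategies, so the sketch tests deviations on a grid of pure strategies only. This is a numerical check under these assumptions, not a proof, and the names used are illustrative.

```python
from typing import List, Sequence


def bilinear(r: Sequence[float], A: Sequence[Sequence[float]], s: Sequence[float]) -> float:
    """Evaluate J(r, s) = r^T A s."""
    return sum(r[i] * A[i][j] * s[j] for i in range(len(r)) for j in range(len(s)))


def moment_vector(t: float, degree: int = 2) -> List[float]:
    """Moments (1, t, t^2, ...) of the pure strategy concentrated at t."""
    return [t ** k for k in range(degree + 1)]


# Hypothetical payoff J(u, v) = (u - v)^2 = u^2 - 2uv + v^2.
A = [
    [0.0, 0.0, 1.0],
    [0.0, -2.0, 0.0],
    [1.0, 0.0, 0.0],
]

r_opt = [1.0, 0.5, 0.5]    # maximizer: probability 1/2 at u = 0 and 1/2 at u = 1
s_opt = [1.0, 0.5, 0.25]   # minimizer: pure strategy v = 1/2
value = bilinear(r_opt, A, s_opt)

grid = [k / 100 for k in range(101)]
# Left inequality of the saddlepoint condition: no r does better against s_opt.
best_for_maximizer = max(bilinear(moment_vector(u), A, s_opt) for u in grid)
# Right inequality: no s does better (lower) against r_opt.
best_for_minimizer = min(bilinear(r_opt, A, moment_vector(v)) for v in grid)
print(value, best_for_maximizer, best_for_minimizer)
```

In this example neither player can improve on the value 1/4 by deviating, and the maximizer's optimal strategy is genuinely mixed while the minimizer's is pure.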

4.2 ADMISSIBLE MOMENTS - THE SETS R AND S

The  search  for  the  saddlepoint  implied  by  (4. 12)  requires  that 
the  sets  R  and  S  of  admissible  generalized  moments  be  carefully 
characterized. They are by definition the sets of all moments generated
by the classes of all cumulative probability distributions
defined on the hypercubes U and V, respectively. The theorem of this
section allows a simpler and more meaningful characterization of
the  sets,  and  is  a  generalization  of  a  theorem  of  Dresher,  Karlin, 
and  Shapley  [38].  We  consider  the  set  R  and  note  that  analogous 
results  may  be  obtained  for  S. 

The  following  well-known  lemma  is  necessary  for  the  proof 
of  the  theorem  and  is  also  used  repeatedly  in  later  sections.  A 
proof is given by Karlin [40].

Lemma A: If [X] is the convex hull of an arbitrary set X
in n-space, then every point of [X] may be
represented as a convex combination of at most
n+1 points of X. Furthermore, if X is connected,
then at most n points are needed.

In many applications of this lemma we are particularly interested
in the fact that a finite convex representation of a point of the
convex hull of a set is possible, with the dimension of the representation
being of secondary importance.

We return to our development of a characterization of the set
R by defining the set $C_R$ as the surface represented parametrically
as a transformation via the functions $r_i(u)$ of all points in U, that
is,

$$ C_R = \{x \mid x \in E^{\mu+1},\ x = r(u) \text{ for some } u \in U\} \tag{4.13} $$

With this set defined, we may proceed to the following theorem,
for which the proof is nearly identical to that for a less comprehensive
theorem given by Karlin [40].

Theorem 4.1: The set R is the convex hull of the set $C_R$
defined by equation (4.13).

Proof: Let D be the convex hull of $C_R$. Then we must
prove that R = D.

(i) We prove first that $R \subset D$. Assume the
contrary. Then there exists $r^\circ \in R$ such that
$r^\circ \notin D$. Now D is the convex hull of the continuous
mapping of the closed convex set U,
and therefore D is itself closed and convex.
But then there must be a hyperplane with normal
vector h which strictly separates $r^\circ$ from D,
i.e.,

$$ h^T r^\circ - h^T r(u) \ge \delta > 0 \quad \text{for all } u \in U \tag{4.14} $$

Since $r^\circ \in R$, there exists a c.d.f. $F^\circ(u)$ such that

$$ \int_U r(u)\, dF^\circ(u) = r^\circ \tag{4.15} $$

If we average (4.14) using this distribution, we
find

$$
h^T r^\circ \int_U dF^\circ(u) - h^T \int_U r(u)\, dF^\circ(u)
= h^T r^\circ - h^T r^\circ \ge \delta \int_U dF^\circ(u) = \delta > 0
\tag{4.16}
$$

which is clearly contradictory. Therefore, $R \subset D$.


(ii) To prove $D \subset R$, we choose an arbitrary
$r^\circ \in D$ and demonstrate a c.d.f. for which the
generalized moments are $r^\circ$. From Lemma A,
since D is by definition the convex hull of $C_R$, we
know that $r^\circ$ can be represented as a finite convex
combination of points of $C_R$, each of which is an
image of a point of U. Thus

$$ r^\circ = \sum_i a_i\, r(u_i), \qquad a_i \ge 0, \quad \sum_i a_i = 1, \quad u_i \in U \tag{4.17} $$

Now let $I_x(u)$ represent the degenerate c.d.f. such
that, for $x \in U$,

$$ dI_x(u) = 0 \quad (u \ne x), \qquad \int_U dI_x(u) = 1 \tag{4.18} $$

and define

$$ F^\circ(u) = \sum_i a_i\, I_{u_i}(u) \tag{4.19} $$

where the $a_i$ and $u_i$ are those determined in (4.17).
Then it follows that

$$ \int_U r(u)\, dF^\circ(u) = \sum_i a_i\, r(u_i) = r^\circ \tag{4.20} $$

Hence the c.d.f. $F^\circ(u)$ yields $r^\circ$ and $D \subset R$. Combining
this with the result of part (i), we have
R = D as required.
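
Part (ii) of the proof is constructive, and the construction can be sketched in Python for the simplest case. The example assumes a scalar control with generalized moments $(u, u^2)$, dropping the constant component; this choice and the function name `two_atom_representation` are hypothetical. For a point $(m_1, m_2)$ with $m_1^2 \le m_2 \le m_1$, which is the convex hull of the curve $(t, t^2)$ on [0, 1], one valid (not unique) representation places atoms at 0 and at $m_2 / m_1$, in agreement with the bound of two points that Lemma A gives for a connected set in 2-space.

```python
from typing import List, Tuple


def two_atom_representation(m1: float, m2: float) -> List[Tuple[float, float]]:
    """Return [(atom, weight), ...] on [0, 1] with first moment m1 and second moment m2.

    The point (m1, m2) must lie in the convex hull of the curve (t, t^2),
    0 <= t <= 1, that is, m1^2 <= m2 <= m1.
    """
    if not (0.0 <= m1 <= 1.0 and m1 * m1 <= m2 <= m1):
        raise ValueError("(m1, m2) is not in the convex hull of the curve (t, t^2)")
    if m1 == 0.0:
        return [(0.0, 1.0)]          # degenerate c.d.f. concentrated at u = 0
    t = m2 / m1                      # location of the second atom, at most 1
    a = m1 * m1 / m2                 # its weight, at most 1
    return [(0.0, 1.0 - a), (t, a)]


atoms_and_weights = two_atom_representation(0.5, 0.3)
first_moment = sum(w * t for t, w in atoms_and_weights)
second_moment = sum(w * t * t for t, w in atoms_and_weights)
total_weight = sum(w for _, w in atoms_and_weights)
print(atoms_and_weights, first_moment, second_moment)
```

The returned atoms and weights play the roles of the $u_i$ and $a_i$ in (4.17), and the mixture of degenerate c.d.f.'s they define reproduces the requested moments as in (4.20).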

An immediate corollary of this is the theorem of Dresher
et al. [38], which was concerned, as was the rest of their work, with
scalar controls u and v for the competitors.

Corollary 4.1-1: When the control space U is one-dimensional,
then R is the convex hull of the curve $C_R$ whose
parametric representation is r = r(t) for $t \in [0, 1]$.
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Under  some  circumstances  the  general  formulation  of  CD 

K 

given  by  {4.  i  3)  can  be  simplified.  The  set  U  can  always  be  written 
as  the  cartesian  product  of  smaller  hypercubes.  Suppose 

U  =  UxxU2  (4.21) 


where  Uj  is  m^-dimensional,  is  -dimensional,  and  m^+m^-ra 
and  assume  that  the  functions  r.(u),  i=0, 1, . . .  ,fi,  are  such  that  if  we 


write 


(4.  22) 


then 


r^u)  =  r.(uj)  i=0, 1, . . . 

=  r^u^)  i=?*i+1»  •••»#* 


(4.23) 


Then  if  we  define  the  surfaces 

Jl  i+l 

Cj  =  {x|  xc  E  »  xi  =  ri(£)  *or  •ome  Uj} 

(4.  24) 

HH-. 

C,  =  {x(xe  E  ,  x.  «  r„  .  (t)  for  some  t  e  U-} 

c  ~  —  X  —  —  C. 

and  let  R^,  be  the  sets  of  generalized  moments  corresponding  to 
the  first  and  second  of  (4.  23),  we  have  the  following  useful  theorem: 

Theorem 4.2: If there exists a decomposition of U such that (4.23) holds, then $C_R = C_1 \times C_2$ and $R = R_1 \times R_2$ for $C_1$, $C_2$ defined by (4.24).

Proof: The first statement follows directly from the definitions of $C_R$, $C_1$, and $C_2$. The second statement is an immediate result of the fact that $R$, $R_1$, and $R_2$ are convex hulls of $C_R$, $C_1$, and $C_2$, respectively, as is seen by using their definitions along with Theorem 4.1.

This simple theorem is particularly useful when the functions $r_i(u)$ each depend upon only one component of u, which we refer to as a situation with uncoupled controls and which is often useful as an approximation in engineering applications. Under these circumstances we may order the functions so that

$$ \begin{aligned} r_i(u) &= r_i(u_1), & i &= 0, 1, \ldots, \mu_1 \\ r_i(u) &= r_i(u_2), & i &= \mu_1 + 1, \ldots, \mu_2 \\ &\ \vdots \\ r_i(u) &= r_i(u_m), & i &= \mu_{m-1} + 1, \ldots, \mu_m \end{aligned} \tag{4.25} $$

Then by defining

$$ C_1 = \{x \mid x \in E^{\mu_1 + 1},\ x_i = r_i(t),\ t \in [0, 1]\} $$
$$ C_j = \{x \mid x \in E^{\mu_j - \mu_{j-1}},\ x_i = r_{\mu_{j-1} + i}(t),\ t \in [0, 1]\}, \qquad j = 2, 3, \ldots, m \tag{4.26} $$

and letting $R_i$ be the convex hull of $C_i$ for $i = 1, 2, \ldots, m$, we have the following corollary:

Corollary 4.2-1: If the controls $u_i$ are uncoupled, then the surface $C_R$ is the cartesian product of the curves $C_i$, and R is the cartesian product of the convex hulls $R_i$ of $C_i$, $i = 1, 2, \ldots, m$, that is

$$ C_R = C_1 \times C_2 \times C_3 \times \cdots \times C_m $$
$$ R = R_1 \times R_2 \times R_3 \times \cdots \times R_m \tag{4.27} $$

Proof: The corollary follows from repeated application of Theorem 4.2.

Note that Theorem 4.2 and its corollary are not trivially true: a general parameterized surface cannot always be represented as a product of parameterized subsurfaces.

4.3 SOLUTIONS - THE METHOD OF CONVEX CONES

At this point we are ready to proceed with the development of solutions to our problem. We shall follow Dresher, et al. [38] for the early development and Theorems 4.3 and 4.4. The key result of this section is Theorem 4.5.

Let us briefly review our results so far. We have found that the problem of finding a saddlepoint in mixed strategies for J(u, v) as given by (4.1) can be transformed to the problem of finding a saddlepoint in the generalized moments $r$ and $s$ for the function $r^T A s$, where $r \in R$ and $s \in S$. Furthermore, we have found that R is the convex hull of the set $C_R$ defined parametrically by $r(u)$ as u ranges over U and that S is the convex hull of an analogously defined set $C_S$. The definitions of R and S imply that they are compact and convex.

Rather than augment the sets R and S, we shall make the convenient assumption that the functions $r_i(u)$ and $s_i(v)$ are such that

$$ r_0(u) = 1, \qquad s_0(v) = 1 \tag{4.28} $$

so that if $r \in R$ then $r_0 = 1$, and if $s \in S$ then $s_0 = 1$. Also we define the sets $\hat{S}$ and $\hat{R}$ as the projected sets $\hat{S} = \{\hat{s} \mid \hat{s} \in E^{\nu},\ \hat{s}_i = s_i,\ i = 1, 2, \ldots, \nu \text{ for some } s \in S\}$ and $\hat{R} = \{\hat{r} \mid \hat{r} \in E^{\mu},\ \hat{r}_i = r_i,\ i = 1, 2, \ldots, \mu \text{ for some } r \in R\}$. These notational conveniences are useful when considering convex cones and support hyperplanes and clearly lead to no loss of generality in our problem definition.

We begin the solution by defining the convex cones

$$ P_R = \{r \mid r \in E^{\mu+1},\ r = \lambda x \text{ for some } \lambda \ge 0 \text{ and } x \in R\} $$
$$ P_S = \{s \mid s \in E^{\nu+1},\ s = \lambda x \text{ for some } \lambda \ge 0 \text{ and } x \in S\} \tag{4.29} $$

Geometrically, these are cones with vertices at the origin, and with cross-sections $\hat{R}$ and $\hat{S}$ at $r_0 = 1$, $s_0 = 1$, respectively. Associated with these cones are the dual cones defined by

$$ P_R^* = \{r \mid r \in E^{\mu+1},\ r^T x \ge 0 \text{ for all } x \in P_R\} $$
$$ P_S^* = \{s \mid s \in E^{\nu+1},\ s^T x \ge 0 \text{ for all } x \in P_S\} \tag{4.30} $$

Note that $P_R^*$ is a closed convex cone, and that $r \in P_R^*$ is a boundary point of $P_R^*$ only if there exists $x \in R$ such that $r^T x = 0$. Analogous statements hold for $P_S^*$.

The relationships of the cones and dual cones are worth noting. Since $P_R$ is a convex cone with vertex at the origin, if $r^\circ$ is an element of its boundary, then there will exist a hyperplane of support H to $P_R$ at $r^\circ$ which contains the origin. Hence, $H = \{x \mid h^T x = 0,\ x \in E^{\mu+1}\}$ for an appropriate h such that

$$ h^T r^\circ = 0, \qquad h^T r \ge 0 \quad \text{for all } r \in P_R \tag{4.31} $$

The representation h of H thus belongs to $P_R^*$, and in fact it can be shown to be a boundary point of $P_R^*$. Equations (4.31) also hold if $r^\circ \in R$ and $r \in R$, provided that only support hyperplanes H to R which pass through the origin are considered. In fact, a little reflection reveals that H can be generated in $E^{\mu}$ by using support hyperplanes to $\hat{R}$ which are not constrained to pass through the origin, a fact which follows from the definition of $\hat{R}$. Therefore, support hyperplanes to $\hat{R}$ are closely related to the support hyperplanes of R and of $P_R$, a useful property which is exploited in later sections.
Furthermore, since $(P_R^*)^* = P_R$, as is easily shown, the support hyperplanes of $P_R^*$ correspond to boundary points of $P_R$ and, ultimately, of R and of $\hat{R}$. The situation for S and $P_S$ is, of course, analogous.

Assume that it is known that the value of the game under consideration is zero, that is

$$ \min_{s \in S} \max_{r \in R} r^T A s = 0 \tag{4.32} $$

Define the set

$$ S(A, R) = \{s \mid s \in E^{\nu+1},\ s = A^T r \text{ for some } r \in R\} \tag{4.33} $$

which is the image of the set R under the linear transformation represented by the matrix $A^T$.

The following two theorems were originally due to Dresher, et al. [38] and are fundamental to our theory. Brief proofs are given because they help illustrate the interrelationships of the sets. The proofs are basically due to Karlin [40].

Theorem 4.3: For the game of value zero, if $R^\circ$ denotes the set of optimal strategies for the maximizing player, then

$$ S(A, R^\circ) = S(A, R) \cap P_S^* \tag{4.34} $$

Furthermore, $S(A, R)$ does not overlap $P_S^*$ in its interior.

Proof: Assume to the contrary that the two sets overlap. Then there exists $r^\circ \in R$ such that $r^{\circ T} A s \ge \delta > 0$ for all $s \in S$, implying that the game has a value of at least $\delta$, a contradiction. Thus the second statement is established.

Since optimal strategies exist, $R^\circ$ is not empty. Let $r^\circ \in R^\circ \subset R$, and note that optimality implies $r^{\circ T} A s \ge 0$ for all $s \in S$, so that $A^T r^\circ \in P_S^*$. Thus $S(A, R^\circ) \subset S(A, R) \cap P_S^*$.

Conversely, $A^T r^\circ \in P_S^*$ for some $r^\circ \in R$ implies $r^{\circ T} A s \ge 0$ for all $s \in S$, which gives $r^\circ \in R^\circ$. Therefore, $S(A, R^\circ) \supset S(A, R) \cap P_S^*$, and the proof is complete.

Theorem 4.4: The separating planes of $S(A, R)$ and $P_S^*$ are in one-to-one correspondence with the optimal strategies for the minimizing player.

Proof: Let $S^\circ$ be the set of optimal strategies for the minimizer. For any $s^\circ \in S^\circ$, we have $r^T A s^\circ \le 0$ for all $r \in R$ and $h^T s^\circ \ge 0$ for all $h \in P_S^*$. Thus $s^\circ$ represents a separating hyperplane.

Conversely, since $S(A, R)$ and $P_S^*$ are in contact, any separating hyperplane must be a support hyperplane to both. Let $s^*$ represent such a hyperplane. Then $r^T A s^* \le 0$ for all $r \in R$, and $h^T s^* \ge 0$ for all $h \in P_S^*$. The latter fact implies $s^* \in P_S$, so that by suitable scaling we may take $s^* \in S$. But this together with $r^T A s^* \le 0$ gives that $s^* \in S^\circ$, and the proof is finished.
In general, of course, a game will have a non-zero value

$$ w = \min_{s \in S} \max_{r \in R} r^T A s \tag{4.35} $$

Define a vector $\boldsymbol{\alpha}$ such that $\alpha_0 = \alpha$ and $\alpha_i = 0$, $i = 1, 2, \ldots, \nu$. Modify the set (4.33) by defining a new set

$$ S(A, R, \alpha) = \{\sigma \mid \sigma \in E^{\nu+1},\ \sigma = A^T r - \boldsymbol{\alpha} \text{ for some } r \in R \text{ and } \alpha_0 = \alpha,\ \alpha_i = 0,\ i = 1, 2, \ldots, \nu\} \tag{4.36} $$

The following theorem is fundamental for our solution methods.

Theorem 4.5: For the game $r^T A s$, $r \in R$ and $s \in S$, the value w is determined by

$$ w = \max \{\alpha \mid P_S^* \cap S(A, R, \alpha) \ne \emptyset\} \tag{4.37} $$

where $\emptyset$ is the empty set.

Proof: We note that the parameter $\alpha$ has the effect of translating the set $S(A, R)$ parallel to the $s_0$-axis. Because $r_0 = 1$ for $r \in R$, this same effect may be had by modifying the $a_{00}$ element of the matrix A. Let us do so, creating the matrix $A_\alpha = \{a^{\alpha}_{ij}\}$ with

$$ a^{\alpha}_{ij} = \begin{cases} a_{00} - \alpha, & i = j = 0 \\ a_{ij}, & \text{otherwise} \end{cases} \tag{4.38} $$

so that

$$ S(A, R, \alpha) = S(A_\alpha, R, 0) = S(A_\alpha, R) \tag{4.39} $$

If we consider the game defined by $A_\alpha$, R, and S, we find, since $s_0 = 1$ for $s \in S$, that

$$ \min_{s \in S} \max_{r \in R} r^T A_\alpha s = \min_{s \in S} \max_{r \in R} r^T A s - \alpha = w - \alpha \tag{4.40} $$

From this equation, our proof follows readily. If $\alpha > w$, then the value of the game with matrix $A_\alpha$ is negative, implying that there exists $s^\circ \in S$ such that $r^T A_\alpha s^\circ < 0$ for all $r \in R$. Since $h \in P_S^*$ means $h^T s^\circ \ge 0$, it must be that $A_\alpha^T r \notin P_S^*$ for all $r \in R$, or equivalently that $P_S^* \cap S(A, R, \alpha) = \emptyset$.

On the other hand, $\alpha \le w$ implies that the game (4.40) has a non-negative value. Thus there will exist $r^\circ \in R$ such that $r^{\circ T} A_\alpha s \ge 0$ for all $s \in S$. This implies $A_\alpha^T r^\circ \in P_S^*$, so that $P_S^* \cap S(A, R, \alpha) \ne \emptyset$. Therefore, w is the largest value of $\alpha$ such that the intersection is non-empty.

From (4.40) we see that as a result of our notation the game with matrix $A_w$ has value zero. Theorems 4.3 and 4.4 can be used to determine the optimum strategy sets $R^\circ$ and $S^\circ$ for this game, and since $S(A_w, R)$ is a simple translation of the set $S(A, R)$, these are also the optimum strategy sets for the original game with matrix A. The three theorems form, therefore, the foundation of a solution technique: translate $S(A, R)$ until it shares only boundary points with $P_S^*$. Then the points of intersection determine $R^\circ$, the amount of translation is the value of the game, and the separating hyperplanes define $S^\circ$.
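The step behind (4.40) can be written out term by term. The following short derivation is only a check under the conventions $r_0 = s_0 = 1$ of (4.28); the symbol $a^{\alpha}_{ij}$ for the elements of $A_\alpha$ is notation assumed here for illustration.

```latex
\begin{align*}
r^T A_\alpha s
  &= \sum_{i=0}^{\mu} \sum_{j=0}^{\nu} a^{\alpha}_{ij}\, r_i s_j
   = \sum_{i=0}^{\mu} \sum_{j=0}^{\nu} a_{ij}\, r_i s_j \;-\; \alpha\, r_0 s_0 \\
  &= r^T A s - \alpha
  \qquad (r_0 = s_0 = 1), \\[4pt]
\min_{s \in S} \max_{r \in R} r^T A_\alpha s
  &= \min_{s \in S} \max_{r \in R} r^T A s - \alpha = w - \alpha .
\end{align*}
```

Because the shift $-\alpha$ does not depend on $r$ or $s$, it leaves the optimizing arguments unchanged, which is why the optimal strategy sets of the translated game carry over to the original one.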

4. 4  GEOMETRIC  AND  ALGEBRAIC  CONSIDERATIONS  FG-- 

35ZPES  - 

We have now established the essence of a solution technique for the problem of finding a saddlepoint in mixed strategies of the mean of the payoff J(u, v) in equation (4.1). In the remainder of this chapter are discussed some of the important considerations in applying the method, including algebraic and geometric descriptions of some of the sets, numerical approximations to solutions, and actual generation of the required probability distribution functions. Of necessity many of the results concern special cases for, as we shall see, characterization of the general problem is often difficult.

In this section we develop more detailed descriptions of the sets R and $P_R^*$. As usual, analogous results hold for S and $P_S^*$. Although we consider mostly special polynomial cases and, in fact, show the difficulty of applying our methods to more general problems, we must remember that Theorem 4.1 is true in general and can always be applied to generate R and that $P_R^*$ can be developed directly from its definition, equation (4.30). We continue to assume that $r_0 = 1$.

Let us first consider the set R under the condition that u is one-dimensional and

$$ r_i(u) = u^i, \qquad i = 0, 1, \ldots, \mu \tag{4.41} $$


This corresponds to a scalar control for the maximizer, and was considered by Karlin and Shapley [41], whose development we follow. For convenience define vectors

$$ t_j = (1,\ t_j,\ t_j^2,\ \ldots,\ t_j^{\mu})^T, \qquad t_j \in [0, 1] \tag{4.42} $$

and note that $C_R$ is the set of all such vectors. Assume $r^\circ$ belongs to the boundary of R, and let $h^\circ$ represent a support hyperplane to R at $r^\circ$. Then

$$ h^{\circ T} r^\circ = 0, \qquad h^{\circ T} r \ge 0 \quad \text{for all } r \in R \tag{4.43} $$

will hold for this $h^\circ$. But by Lemma A,

$$ r^\circ = \sum_{i=1}^{\mu+1} \alpha_i\, t_i \tag{4.44} $$

for suitable $t_i \in C_R$, where $\sum_{i=1}^{\mu+1} \alpha_i = 1$ and $\alpha_i \ge 0$, $i = 1, 2, \ldots, \mu + 1$. Substituting (4.44) into (4.43) gives

$$ \sum_{i=1}^{\mu+1} \alpha_i\, h^{\circ T} t_i = 0 \tag{4.45} $$

which gives, for all i such that $\alpha_i > 0$,

$$ h^{\circ T} t_i = 0 \tag{4.46} $$

since $t_j \in C_R \subset R$ implies $h^{\circ T} t_j \ge 0$ for all j. Therefore, we may state that all points $t_i$ which appear nontrivially ($\alpha_i > 0$) in the representation of $r^\circ$ also lie in the hyperplane represented by $h^\circ$. Furthermore, all points $r$ which belong to the boundary of R and which are convex combinations of points $t_i$, $i = 1, 2, \ldots, k$, for some $k \le \mu + 1$ lie in the hyperplane defined by

$$ h^{\circ T} t_j = 0, \qquad j = 1, 2, \ldots, k \tag{4.47} $$

With the above basic facts established, we proceed to develop a representation for $h^\circ$. The requirement on $h^\circ$ represented by (4.43) implies that

$$ h^{\circ T} t \ge 0 \tag{4.48} $$

for all $t \in [0, 1]$. This is a polynomial in t by definition of the vector $t$, and the inequality implies that any root of the polynomial on the open interval (0, 1) must be a double root. Thus there can be at most $[\mu/2]$ zeros of (4.48) in (0, 1), where [x] is the largest integer less than or equal to x. The roots corresponding to t=0 and t=1, if any, may be single roots.

We notice that we may confine our attention to hyperplanes for which (4.48) has exactly $\mu$ zeros in [0, 1], counted with multiplicity. This follows from the observation that, for example, a boundary point $r$ with representation in terms of points $t_i$, $i = 1, 2, \ldots, k < [\mu/2]$, can be represented in terms of points $t_i$, $i = 1, 2, \ldots, [\mu/2]$, when the additional points are given weightings $\alpha_i = 0$, $i = k+1, \ldots, [\mu/2]$. This is equivalent to selecting a particular support hyperplane when there is not a unique support hyperplane. Thus we come to two cases, depending upon whether $\mu$ is odd or even.


Case 1: $\mu$ even. The hyperplanes of interest will have either (a) $\mu/2$ distinct roots in (0, 1) or will have (b) $\mu/2 - 1$ distinct roots in (0, 1) plus single roots at t=0 and t=1. Therefore, the hyperplane will have elements implied by

$$ \text{(a)} \quad h^T t = a \prod_{j=1}^{\mu/2} (t - t_j)^2, \qquad a > 0 $$
$$ \text{(b)} \quad h^T t = a\, t (1 - t) \prod_{j=1}^{\mu/2 - 1} (t - t_j)^2, \qquad a > 0 \tag{4.49} $$

which result from simply writing out the polynomials in different form.

Case 2: $\mu$ odd. The hyperplanes of interest have $[\mu/2]$ distinct roots of (4.48) in (0, 1) plus either (a) a single root at t=0 or (b) a single root at t=1. The elements of h will be implied by

$$ \text{(a)} \quad h^T t = a\, t \prod_{j=1}^{[\mu/2]} (t - t_j)^2, \qquad a > 0 $$
$$ \text{(b)} \quad h^T t = a (1 - t) \prod_{j=1}^{[\mu/2]} (t - t_j)^2, \qquad a > 0 \tag{4.50} $$

In either Case 1 or Case 2, the elements of h may be found in terms of the roots by simply matching coefficients. Therefore, h may be found explicitly in terms of a set of parameters. For a given $\mu$, then, we may find all support hyperplanes to $\hat{R}$ by considering both type (a) and type (b) hyperplanes and allowing the roots $t_j$ to vary over (0, 1). We shall find occasion to refer to the type (a) and (b) hyperplanes as lower and upper support hyperplanes, respectively. As a memory aid, we note that upper supports always have a single root at t=1.
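The coefficient matching described above is mechanical and can be sketched in a few lines. The code below is a minimal illustration, not the author's procedure: the function names and the `kind` labels are hypothetical. It expands a * e(t) * prod_j (t - t_j)^2, where e(t) is the endpoint factor 1, t, (1 - t), or t(1 - t), and returns the coefficients of 1, t, t^2, ..., which are the elements of h.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def support_vector(roots, kind, a=1.0):
    """Coefficients h of a * e(t) * prod_j (t - t_j)^2, lowest degree first.

    kind selects the endpoint factor e(t):
      "none" -> 1, "zero" -> t, "one" -> (1 - t), "both" -> t * (1 - t).
    """
    factors = {
        "none": [1.0],
        "zero": [0.0, 1.0],
        "one": [1.0, -1.0],
        "both": [0.0, 1.0, -1.0],
    }
    h = [a * c for c in factors[kind]]
    for tj in roots:
        h = poly_mul(h, [tj * tj, -2.0 * tj, 1.0])  # (t - tj)^2 expanded
    return h


def evaluate(h, t):
    """Value of the polynomial h^T (1, t, t^2, ...) at t."""
    return sum(c * t ** k for k, c in enumerate(h))


# mu = 2: one interior double root, and the endpoint-only case.
h_double_root = support_vector([0.5], "none")
h_endpoints = support_vector([], "both")
# mu = 3: single root at t = 0 plus one interior double root.
h_cubic = support_vector([0.3], "zero")
```

The length of the returned list is $\mu + 1$, so the choice of `kind` and the number of interior roots together fix $\mu$.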

To clarify the ideas developed so far, we present a simple example. Suppose $\mu = 2$, so that $C_R = \{t \mid t_0 = 1,\ t_1 = t,\ t_2 = t^2;\ t \in [0, 1]\}$ and R is the convex hull of $C_R$. Then for any h, either

$$ h^T t = a (t - t_1)^2, \qquad t_1 \in (0, 1) $$

or

$$ h^T t = a\, t (1 - t) $$

These equations imply lower support planes of the form

$$ h = a\, (t_1^2,\ -2 t_1,\ 1)^T, \qquad a > 0 $$

and upper planes of the form

$$ h = a\, (0,\ 1,\ -1)^T, \qquad a > 0 $$

We may now use our knowledge of the support hyperplanes to characterize R in two ways. First, since R is convex, it is determined by the intersection of the half-spaces defined by its support hyperplanes. Thus we may determine if a candidate point $r$ belongs to R by checking whether

$$ h_L^T(t_1, t_2, \ldots, t_{[\mu/2]})\, r \ge 0 \quad \text{all } t_j \in (0, 1) $$
$$ h_U^T(t_1, t_2, \ldots, t_{k(\mu)})\, r \ge 0 \quad \text{all } t_j \in (0, 1) \tag{4.51} $$

where $h_L$ and $h_U$ are the explicit representations of the relevant lower and upper support planes in terms of the parameters $t_j$, and $k(\mu) = [\mu/2]$ for $\mu$ odd and $k(\mu) = \mu/2 - 1$ for $\mu$ even. This interpretation is exploited in the next section.
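For the case $\mu = 2$ the membership test (4.51) is easy to try numerically. The sketch below is an approximation under stated assumptions: the lower supports are sampled on a finite grid of $t_1$ values rather than checked for all $t_1$, and the function name, grid size, and tolerance are illustrative choices that do not come from the text.

```python
def in_R_mu2(r, grid=1001, tol=1e-12):
    """Approximate test of (4.51) for mu = 2, with r = (r0, r1, r2) and r0 = 1.

    Lower supports: h = (t1^2, -2*t1, 1) for t1 sampled in (0, 1).
    Upper support:  h = (0, 1, -1).
    """
    r0, r1, r2 = r
    upper_ok = r1 - r2 >= -tol
    lower_ok = all(
        (t1 * t1) * r0 - 2.0 * t1 * r1 + r2 >= -tol
        for t1 in (k / (grid + 1) for k in range(1, grid + 1))
    )
    return upper_ok and lower_ok


inside = in_R_mu2((1.0, 0.5, 0.3))       # between the parabola and the chord
below_curve = in_R_mu2((1.0, 0.5, 0.2))  # violates a lower support near t1 = 0.5
above_chord = in_R_mu2((1.0, 0.5, 0.6))  # violates the upper support
```

With $r_0 = 1$ the two families of inequalities reduce to $r_1^2 \le r_2 \le r_1$, which is the region between the parabola and its chord; a point very close to the lower boundary may be misclassified by a coarse grid.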

Second, and perhaps more important, the development of the representation of h suggests what the boundary of R looks like. Examination of the arguments indicates that R will have a lower surface consisting of all convex combinations of all sets of exactly $[\mu/2]$ points $t$, $t \in (0, 1)$, and, if $\mu$ is odd, the point $t$ for t=0. Also, R will have an upper surface consisting of all convex combinations of the point $t$ for t=1, $k(\mu)$ points generated by t in (0, 1), and, if $\mu$ is even, the point generated by t=0. Thus if $\mu = 2$, R has lower boundary defined by points $t$, $t \in (0, 1)$, and upper boundary defined by all points on the line segment from $(1\ 0\ 0)^T$ to $(1\ 1\ 1)^T$. If $\mu = 3$, R has lower boundary defined by all points on the line segments from $(1\ 0\ 0\ 0)^T$ to $(1\ t\ t^2\ t^3)^T$ and upper boundary defined by line segments from $(1\ 1\ 1\ 1)^T$ to $(1\ t\ t^2\ t^3)^T$.


The above discussion is easily extended to the case of uncoupled controls, equation (4.25), since by the use of Corollary 4.2-1 it is known that R is a cartesian product of sets $R_j$. Thus if each function $r_i$ has the form

$$ r_i(u) = u_j^{k_{ij}}, \qquad i = 1, 2, \ldots, \mu \tag{4.52} $$

for some admissible integers j and $k_{ij} > 0$, and if we then order these functions in increasing j and for each j order the functions in increasing $k_{ij}$, then each $R_j$ will, except for the constant term implied by $r_0 = 1$, be like the set R for the scalar control considered above. Explicitly we define $\hat{R}_j$ as the convex hull of the curve

$$ \{x \mid x \in E^{\mu_j},\ x_i = t^i,\ i = 1, 2, \ldots, \mu_j,\ t \in [0, 1]\} \tag{4.53} $$

so that we have $R = \{1\} \times \hat{R}_1 \times \hat{R}_2 \times \cdots \times \hat{R}_m$ and, by implication, $\sum_{j=1}^{m} \mu_j = \mu$. (This latter assumption is made without loss of generality, since the payoff may be augmented to make it true.) Then it is easy to show that $h \in E^{\mu+1}$ such that h supports R,

$$ h^T r \ge 0 \quad \text{all } r \in R, \qquad h^T r^\circ = 0 \quad \text{some } r^\circ \in R $$

implies, for $j = 1, 2, \ldots, m$ and proper choice of $h_{0j}$,

$$ h_{0j} + h_j^T \hat{r}_j \ge 0 \quad \text{all } \hat{r}_j \in \hat{R}_j, \qquad h_{0j} + h_j^T \hat{r}_j^\circ = 0 \quad \text{some } \hat{r}_j^\circ \in \hat{R}_j $$

where

$$ \sum_{j=1}^{m} h_{0j} = h_0 $$

Hence, the hyperplane must support each of the sets $\hat{R}_j$ individually. Thus the character of each of the sets $\hat{R}_j$ is established, as is the character and potential parameterization of the support hyperplanes.

Of particular interest is the fact that each $\hat{R}_j$ has an upper and a lower surface, and therefore we may think of R as having $2^m$ surfaces and of there being $2^m$ types of hyperplanes supporting R. Each surface and each hyperplane type can be explicitly generated by choosing an upper or lower surface and the corresponding hyperplane set for each $\hat{R}_j$, $j = 1, 2, \ldots, m$, for each combination of "upper" and "lower."

The  construction  of  R  when  the  controls  are  coupled  does  not 
appear  to  be  amenable  to  analysis  of  the  type  used  above.  A  pair 
of  simple  examples  will  help  illustrate  the  difficulties.  For 
example,  let 


r(u)  = 


u.c[0,  l],  i=l ,  2 


(4.  54) 


U1U2 
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Then,  as  sketched  in  Figure  4-1  for  the  cross-cection  r^  -  i,  we 
find  that  R  is  the  polygon  with  vertices 


r*l 


1 


I 


I 


9 


I 


f 


I 


I 


I 


(4,  58) 


and the surfaces of R are (a) the surface $C_R$, and portions of the planes (b) $r_3 = 0$, (c) $r_1 = 1$, (d) $r_2 = 1$, (e) $r_1 + r_2 - r_3 = 1$.

In comparing Examples 1 and 2, we see first that $C_R$ is not necessarily a boundary surface of R, although it may be. Furthermore, the examples do not even have the same number of sets of support planes, since Example 1 has four sets and Example 2 has five sets.

Because  of  the  apparent  lack  of  common  properties  in  the 
two  examples,  it  appears  likely  that  construction  of  R  must  usually 
be  done  on  a  case  by  case  basis  using  Theorem  4. 1 .  Naturally, 
other  important  special  cases  besides  those  of  scalar  and  uncoupled 
controls  which  we  have  discussed  may  be  characterizable. 

At this point we turn from the set R to the dual cone $P_R^*$. Since it is the boundary of the dual cone which is of importance for problem solutions (Theorem 4.3), we shall be particularly concerned with generating that boundary. We establish the following theorem as being particularly useful in this regard.

Theorem 4.6: The dual cone $P_R^*$ may be generated using the surface $C_R$, that is,

$$ P_R^* = \{x \mid x^T r \ge 0 \text{ for all } r \in C_R\} \tag{4.59} $$

Proof: Let R be the convex hull of $C_R$ and let $P_R^*$ be the dual cone corresponding to the cone $P_R$ generated by R. Let $P^{\#}$ denote the set defined by the right hand side of (4.59). Then we must prove that $P_R^* = P^{\#}$. Since $C_R \subset P_R$, it is clear that the definition of $P^{\#}$ is less restrictive than that of $P_R^*$, so that $P_R^* \subset P^{\#}$.

Conversely, let $h \in P^{\#}$. By Lemma A any point $r^\circ \in R$ may be represented by a finite convex combination of points $r_i$ of $C_R$, i.e.,

$$ r^\circ = \sum_{i=1}^{k} \alpha_i\, r_i, \qquad \sum_{i=1}^{k} \alpha_i = 1, \quad \alpha_i \ge 0 $$

for some integer $k \le \mu + 1$. Furthermore, any point $x \in P_R$ may be represented as a non-negative scalar multiple of some point $r^\circ \in R$, $x = \lambda r^\circ$. Thus for arbitrary $x \in P_R$ we have, for $h \in P^{\#}$,

$$ h^T x = \lambda\, h^T r^\circ = \lambda \sum_{i=1}^{k} \alpha_i\, h^T r_i \tag{4.60} $$

Since $\lambda$ and $\alpha_i$ are non-negative, and $h^T r_i \ge 0$ because $h \in P^{\#}$ and $r_i \in C_R$ by definition, equation (4.60) is non-negative. Therefore $P^{\#} \subset P_R^*$ and our proof is complete.

Use of this theorem leads to a general technique for generating $P_R^*$, one that will be used for the analogous set $P_S^*$ in the next section. For each point $r \in C_R$, we may generate a half-space

$$ H(r) = \{x \mid x \in E^{\mu+1},\ x^T r \ge 0\} \tag{4.61} $$

The intersection of all such half-spaces constitutes the set $P_R^*$. The boundary of $P_R^*$ can consist only of points x for which $x^T r = 0$ for at least one $r \in C_R$, although the existence of such an $r$ does not guarantee that x is a boundary point. The generation of $P_R^*$ by this approach can obviously be tedious.
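The half-space construction of Theorem 4.6 suggests a simple sampled test for membership in the dual cone. The sketch below is a numerical approximation only: it checks $h^T r(u) \ge 0$ on a finite grid of controls, so passing the test is necessary but not sufficient for $h \in P_R^*$. The function names, the grid size, and the scalar example r(u) = (1, u, u^2) are assumptions made for illustration.

```python
import itertools


def in_dual_cone(h, r, dims, grid=51, tol=1e-12):
    """Sampled test of (4.59): h^T r(u) >= 0 for u on a grid of [0, 1]^dims.

    r maps a tuple u of length dims to the moment vector r(u).
    """
    axis = [k / (grid - 1) for k in range(grid)]
    for u in itertools.product(axis, repeat=dims):
        if sum(hk * rk for hk, rk in zip(h, r(u))) < -tol:
            return False
    return True


def r_scalar(u):
    """Assumed scalar-control moment functions: r(u) = (1, u, u^2)."""
    (t,) = u
    return (1.0, t, t * t)


# (t - 0.5)^2 is non-negative on [0, 1]; t^2 - t is not.
accepted = in_dual_cone((0.25, -1.0, 1.0), r_scalar, dims=1)
rejected = in_dual_cone((0.0, -1.0, 1.0), r_scalar, dims=1)
```

The cost grows as `grid ** dims`, which is one concrete sense in which generating the dual cone this way becomes tedious for vector controls.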

For the special case of polynomials and scalar controls, we are able to say slightly more about $P_R^*$. In this case, we are working with polynomials

$$ h^T t \ge 0 \tag{4.62} $$

where $t = (1\ t\ t^2\ \ldots\ t^{\mu})^T$, since $C_R$ is defined by the vectors $t$, and where $h \in P_R^*$. To be on the boundary of $P_R^*$, a vector h must have a corresponding $t_1$ such that

$$ h^T t_1 = 0 \tag{4.63} $$

However, since (4.62) must hold for all t for a given h, it follows that if $t_1 \in (0, 1)$,

$$ \text{(a)} \quad \left. \frac{d}{dt} \left( h^T t \right) \right|_{t = t_1} = 0 $$
$$ \text{(b)} \quad \left. \frac{d^2}{dt^2} \left( h^T t \right) \right|_{t = t_1} \ge 0 \tag{4.64} $$

As we shall see in later sections, the relationships (4.63) and (4.64a) can be used to find h in terms of $t \in (0, 1)$ for some regions of $P_R^*$. The usual extensions to include end points t=0 and t=1, and to consider uncoupled controls using cartesian products, may be made.

We remark that since points of the boundary of $P_R^*$ correspond to support hyperplanes, the discussion at the beginning of this section concerning support hyperplanes for R can in principle be used to find $P_R^*$. However, considerable additional work is needed because that discussion did not use all support hyperplanes when a choice was possible. The unused planes were unneeded for defining R, but are essential for defining $P_R^*$. Therefore the method outlined here appears preferable operationally. Theorems related to extending the hyperplane approach for scalar controls may be found in Shapley and Karlin [41].

4.5 NUMERICAL SOLUTIONS AND AN APPROXIMATION TECHNIQUE

Actual solution of problems of the type considered here is difficult at best. Dresher, Karlin, and Shapley [38] suggest a formulation in which a set of nonlinear equations is to be solved, and Dresher and Karlin [54] and Karlin [40] propose a type of fixed-point mapping. Both methods can be exceedingly tedious algebraically even for modest problems, and numerical approximation does not appear to be straightforward.

Any two-person zero-sum static game may be approximated and solved numerically by restraining the players to finite control sets $\{u_i\}$ and $\{v_j\}$, computing the payoff $b_{ij} = J(u_i, v_j)$ for each pair of controls of the maximizer and minimizer, and then solving the matrix game $B = \{b_{ij}\}$ for mixtures of the given controls. This brute-force approach tends to obscure any subtleties in the interactions of the players and to be difficult to interpret relative to the given problem. Its sole advantage is its generality.
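The brute-force discretization can be sketched as follows. The text does not say how the resulting matrix game is to be solved; the code below swaps in Brown's fictitious play purely for illustration because it needs no linear programming library, and it yields only bounds on the value, not an exact solution. The function names and the sample payoff J(u, v) = (u - v)^2 are hypothetical.

```python
def payoff_matrix(J, us, vs):
    """b[i][j] = J(u_i, v_j) on the finite control sets us and vs."""
    return [[J(u, v) for v in vs] for u in us]


def fictitious_play(B, iterations=20000):
    """Fictitious play for the matrix game B; the row player maximizes.

    Returns (lower, upper, p, q): bounds on the value and the empirical
    mixed strategies of the row and column players.
    """
    m, n = len(B), len(B[0])
    row_counts, col_counts = [0] * m, [0] * n
    row_payoff = [0.0] * m  # cumulative payoff of each row vs. past columns
    col_payoff = [0.0] * n  # cumulative payoff of each column vs. past rows
    i, j = 0, 0
    best_lower, best_upper = float("-inf"), float("inf")
    for k in range(1, iterations + 1):
        row_counts[i] += 1
        col_counts[j] += 1
        for jj in range(n):
            col_payoff[jj] += B[i][jj]
        for ii in range(m):
            row_payoff[ii] += B[ii][j]
        # The empirical mixtures give valid bounds on the value at every step.
        best_lower = max(best_lower, min(col_payoff) / k)
        best_upper = min(best_upper, max(row_payoff) / k)
        # Each player best-responds to the opponent's empirical mixture.
        i = max(range(m), key=lambda ii: row_payoff[ii])
        j = min(range(n), key=lambda jj: col_payoff[jj])
    p = [c / iterations for c in row_counts]
    q = [c / iterations for c in col_counts]
    return best_lower, best_upper, p, q


# Hypothetical example: both players restricted to the controls {0, 1}.
B = payoff_matrix(lambda u, v: (u - v) ** 2, [0.0, 1.0], [0.0, 1.0])
lower, upper, p, q = fictitious_play(B, iterations=2000)
```

The gap `upper - lower` indicates how far the iteration is from the value of the discretized game; it says nothing about the error introduced by the discretization itself.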

An alternative solution method, amenable to both numerical approximation and analytic interpretation, may be developed based upon Theorem 4.5. In fact, that theorem implies that we may solve our game problem by solving the following mathematical programming problem:

Problem: Find the maximum value of the parameter $\alpha$ for which there exists a vector $r \in R$ such that $A_\alpha^T r \in P_S^*$, where $A_\alpha$ is defined by (4.38). (4.65)

The resulting maximum value of $\alpha$ is the value w of the game by Theorem 4.5, the set $R^\circ \subset R$ of all vectors $r^\circ$ such that $A_w^T r^\circ \in P_S^*$ represents the optimal strategies for the maximizer by Theorem 4.3, and separating hyperplanes of $P_S^*$ and $S(A, R, w)$ (see equation (4.36)) yield the optimal strategy set $S^\circ$ for the minimizer by Theorem 4.4.

For simple problems the constrained maximization problem (4.65) may be solved fairly directly, as is demonstrated in the examples of Chapter 6. For more complicated problems the maximization becomes difficult to visualize geometrically and difficult to manipulate algebraically. Approximation, however, is straightforward, for since the sets R and $P_S^*$ are convex, they may be approximated by a convex polyhedron and a convex polyhedral cone, respectively, to any desired accuracy; then the constraining sets are polyhedral, and problem (4.65) may be solved as a linear programming problem. This discrete approximation and use of linear programming is the essence of the method which is discussed in some detail in the remainder of this section. One of the examples in Chapter 6 helps illustrate the concepts.

We begin by demonstrating the nature of the linear programming approximation to our problem. Let $\tilde{R}$ be a convex polyhedron and let $\tilde{P}_S^*$ be a convex polyhedral cone. Then the requirement $r \in \tilde{R}$ can be expressed by the requirement that $r$ satisfy the linear inequalities

$$ \tilde{r}_i^T r \ge 0, \qquad i = 1, 2, \ldots, N_R \tag{4.66} $$

for some finite $N_R$ and suitable vectors $\tilde{r}_i$; similarly $s \in \tilde{P}_S^*$ can be expressed by

$$ \tilde{s}_i^T s \ge 0, \qquad i = 1, 2, \ldots, N_S \tag{4.67} $$

for a finite integer $N_S$ and suitable $\tilde{s}_i$. Note that we have used our convention $r_0 = 1$, $s_0 = 1$. Using these representations and the definition of $A_\alpha$, we may approximate problem (4.65) by the linear programming problem:

$$ \max_{\alpha,\ r}\ \alpha $$

subject to the constraints

$$ \tilde{r}_i^T r \ge 0, \qquad i = 1, 2, \ldots, N_R $$
$$ \tilde{s}_i^T A_\alpha^T r \ge 0, \qquad i = 1, 2, \ldots, N_S \tag{4.68} $$

This approximation applies to general separable games of the form (4.1), since no special properties of the sets R and $P_S^*$ have been utilized.
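That the second family of constraints in (4.68) is linear in the unknowns $(\alpha, r)$ may not be obvious, since $\alpha$ enters through the matrix $A_\alpha$. The following expansion is a short check using only the definition (4.38) and the convention $r_0 = 1$; the component notation $\tilde{s}_{i0}$ for the leading element of $\tilde{s}_i$ is assumed here.

```latex
\begin{align*}
\tilde{s}_i^T A_\alpha^T r
  &= r^T A_\alpha\, \tilde{s}_i
   = r^T A\, \tilde{s}_i - \alpha\, r_0\, \tilde{s}_{i0} \\
  &= \tilde{s}_i^T A^T r - \alpha\, \tilde{s}_{i0}
  \qquad (r_0 = 1), \\[4pt]
\text{so the constraint reads}\quad
  & \tilde{s}_i^T A^T r - \tilde{s}_{i0}\, \alpha \ \ge\ 0,
  \qquad i = 1, 2, \ldots, N_S .
\end{align*}
```

Each such constraint is a single linear inequality in $\alpha$ and the free components $r_1, \ldots, r_\mu$, with the $r_0$ terms of $\tilde{r}_i^T r$ and $\tilde{s}_i^T A^T r$ acting as constants.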

Creating suitable approximations to $R$ and to $P_S$ turns out to be straightforward, as each can be handled in either of two ways. By Theorem 4.1, $R$ is the convex hull of the surface $C_R$. If a finite number of points $r_i \in C_R$ are chosen, then $\tilde{R}$ may be formed as the convex hull of those points, and the $\tilde{r}_i$ are then the representations of the hyperplanes defining $\tilde{R}$. Under these circumstances $\tilde{R} \subset R$, so that an $r^\circ$ which is a solution to (4.68) is an admissible moment vector for the maximizer. The value $\alpha^\circ$ may, depending upon $\tilde{P}_S$, tend to underestimate the value $w$ of the original game.

Forming a convex hull of a given set of points and then finding the defining hyperplanes can be very tedious. If the support hyperplanes of $R$ are known parametrically, as discussed in Section 4.4, then the $\tilde{r}_i$ for equation (4.68) may be taken as realizations of those hyperplanes for a finite number of parameter choices. By implication $\tilde{R}$ will then be the intersection of the half-spaces defined by those hyperplanes, and thus $R \subset \tilde{R}$. This approximation, while easy to generate, tends to overestimate $w$, and it may also produce an optimal strategy vector $r^\circ \notin R$. This latter eventuality requires an additional solution step in order to find $r^* \approx r^\circ$, $r^* \in R$. Note that the vectors $\tilde{r}_i$, $i = 1, 2, \ldots, N_R$, represent support hyperplanes to $\tilde{R}$ whether the approximation to $R$ is internal or external. This will be useful in establishing optimal c.d.f.'s, as is shown in Section 4.6.

If the boundary points of $P_S$ are known explicitly, then by forming the convex cone of a finite set of those points and determining the support planes $\tilde{s}_j$, we may generate an approximation $\tilde{P}_S \subset P_S$. Because of the nature of the interaction of $P_S$ and $R$, $w$ may be underestimated when problem (4.68) is solved. Also, although the support planes $s$ of $P_S$ belong to $S$, the support planes $\tilde{s}_j$ of $\tilde{P}_S$ may not have this property.

An alternative method of creating $\tilde{P}_S$ is both simpler and occasionally more useful than the method above. For the purposes of solving the linear programming problem, we are interested only in the support planes to $\tilde{P}_S$. From Theorem 4.6, the boundary of $P_S$ may be obtained using only the set $C_S$. Therefore, we may define a boundary of $\tilde{P}_S$ using a finite set of points of $C_S$; i.e., pick $\tilde{s}_j \in C_S$, $j = 1, 2, \ldots, N_S$, for use in (4.68). This yields $P_S \subset \tilde{P}_S$ and a possibly overestimated value $w$. Since $\tilde{s}_j \in S$ and $\tilde{s}_j$ supports $\tilde{P}_S$, if it also supports $S(A, R, w)$ it will be an approximate optimal strategy for the minimizer.

Because approximations to $R$ and $P_S$ are reasonably generated and because the game problem may be reduced to a maximization problem, linear programming is clearly a useful tool for approximating the value of a game and the optimum moments for the maximizing player. The strategies for the minimizer, which cannot in general be read off from the solution of (4.68) because they correspond to separating hyperplanes rather than points, can be found simply by taking the negative of the original game, so that the minimizer becomes the maximizer. Errors due to approximation can of course be reduced using sophisticated computer programming, fine granularity in the approximations, iterative techniques, and special problem characteristics.
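
As a concrete illustration of problem (4.68), the following Python sketch solves one very small instance by brute-force vertex enumeration. Everything specific in it is an assumption made for illustration and is not taken from the text: the example game $J(u, v) = (u - v)^2$ with $r(u) = (1, u, u^2)$ and $s(v) = (1, v, v^2)$, the reading of $A_\alpha$ as $A$ with $\alpha$ subtracted from its $(0, 0)$ element, the grids, and all function names. Here $\tilde{R}$ is an external approximation built from tangent lines of the moment curve plus its chord, and $\tilde{P}_S$ is defined by sample points of $C_S$.

```python
from itertools import combinations

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(rows)
    if abs(d) < 1e-12:
        return None
    x = []
    for k in range(3):
        m = [list(row) for row in rows]
        for j in range(3):
            m[j][k] = rhs[j]
        x.append(det3(m) / d)
    return x

def build_constraints(t_grid, v_grid):
    """Rows (c1, c2, c3, b) meaning c1*r1 + c2*r2 + c3*alpha >= b."""
    cons = [(1.0, -1.0, 0.0, 0.0)]              # chord of the moment curve: r1 - r2 >= 0
    for t in t_grid:                            # tangent at u = t: r2 - 2 t r1 + t^2 >= 0
        cons.append((-2.0 * t, 1.0, 0.0, -t * t))
    for v in v_grid:                            # r^T A s(v) - alpha >= 0 (assumed A_alpha)
        cons.append((-2.0 * v, 1.0, -1.0, -v * v))
    return cons

def solve_lp(cons, tol=1e-9):
    """Maximize alpha by testing every intersection of three constraints."""
    best = None
    for trio in combinations(cons, 3):
        x = solve3([c[:3] for c in trio], [c[3] for c in trio])
        if x is None:
            continue
        feasible = all(c[0] * x[0] + c[1] * x[1] + c[2] * x[2] >= c[3] - tol
                       for c in cons)
        if feasible and (best is None or x[2] > best[2]):
            best = x
    return best

grid = [k / 4 for k in range(5)]                # hypothetical grid on [0, 1]
r1, r2, alpha = solve_lp(build_constraints(grid, grid))
print(alpha)
```

Vertex enumeration is used only because it keeps the sketch free of external libraries; any linear programming code would serve. For this example game the optimal moment vector is not unique, so only the value $\alpha$ should be compared between runs or grids.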

4.6 COMPUTING THE CUMULATIVE DISTRIBUTION FUNCTIONS

The method of dual cones can of course be used to find saddlepoint solutions for given general problems with payoff $r^T A s$, where $r$ and $s$ belong to compact convex sets $R$ and $S$, respectively, but ordinarily such problems are intermediate steps to solving problems with payoff $J(u, v)$ of the form (4.1), that is, with separable payoff. For these problems it is ultimately desired that optimal cumulative distribution functions (c.d.f.'s) $F^\circ(u)$ and $G^\circ(v)$ be found for the maximizer and minimizer. In this section we consider the problem of finding the function $F^\circ(u)$ corresponding to a given $r^\circ \in R$, with the understanding that the situation for $G^\circ(v)$ and $s^\circ \in S$ is analogous.

The fundamental relationship between $r$ and $F(u)$ is given by equation (4.7), which in vector form is

$$r(F) = \int_U r(u)\, dF(u) \tag{4.7}$$

where $r(u)$ results from the defining cost function

$$J(u, v) = r^T(u)\, A\, s(v) \tag{4.1}$$

As in Section 4.2, let $I_{u^\circ}(u)$ denote the degenerate distribution (4.18) for which the entire probability mass is concentrated at $u^\circ$, so that

$$I_{u^\circ}(u) = \begin{cases} 1, & u \ge u^\circ \\ 0, & \text{otherwise} \end{cases} \tag{4.69}$$

where the vector inequality denotes element by element inequality. This distribution has the property, if $\tilde{U}$ is an open set in $U$,

$$\int_{\tilde{U}} dI_{u^\circ}(u) = \begin{cases} 0, & u^\circ \notin \tilde{U} \subset U \\ 1, & u^\circ \in \tilde{U} \subset U \end{cases} \tag{4.70}$$


Then if $F(u)$ is a pure strategy concentrated at $u^\circ \in U$, i.e., if $F(u) = I_{u^\circ}(u)$, we have from (4.7) that

$$r(F) = r(u^\circ) \tag{4.71}$$

Therefore, as can be seen by reviewing the definition (4.13) of the set $C_R$, a pure strategy generates a point of $C_R$. Furthermore, a point $r^\circ \in C_R$ must have at least one inverse point $u^\circ \in U$, implying that there is a $u^\circ$ such that the pure strategy $I_{u^\circ}(u)$ generates $r^\circ$.

As stated by Lemma A and used in the proof of Theorem 4.1, any point $r^\circ \in R$ may be written

$$r^\circ = \sum_{i=1}^{\mu+1} \alpha_i\, r(u_i), \qquad \alpha_i \ge 0, \quad \sum_{i=1}^{\mu+1} \alpha_i = 1, \quad u_i \in U \tag{4.17}$$

and this $r^\circ$ will correspond to a c.d.f.

$$F^\circ(u) = \sum_{i=1}^{\mu+1} \alpha_i\, I_{u_i}(u) \tag{4.19}$$

Therefore, any point $r^\circ \in R$ may be generated using a c.d.f. which is a finite convex combination of pure strategies. This rather surprising fact is the basis for finding c.d.f.'s, for a general method, given $r^\circ \in R$ as a result of the method of dual cones, is to find a convex representation for $r^\circ$ in terms of points $r_i \in C_R$, $i = 1, 2, \ldots, k \le \mu + 1$, and then "invert" the functions $r(u)$ to find the corresponding pure strategies $u_i$, $i = 1, 2, \ldots, k$. The pure strategy set $\{u_i\}$, $i = 1, 2, \ldots, k$, for a c.d.f. is then the spectrum of that c.d.f.
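
The correspondence between a finite-spectrum c.d.f. and its moment vector can be made concrete with a short Python sketch. It assumes, purely for illustration, a scalar control with $r_k(u) = u^k$ and $r_0 = 1$; the function name `moment_vector` and the sample spectrum are hypothetical.

```python
from fractions import Fraction

def moment_vector(spectrum, weights, degree):
    """Return [r_0, ..., r_degree] with r_k = sum_i alpha_i * u_i**k."""
    if any(w < 0 for w in weights) or sum(weights) != 1:
        raise ValueError("weights must be non-negative and sum to one")
    return [sum(w * u ** k for u, w in zip(spectrum, weights))
            for k in range(degree + 1)]

# A c.d.f. with half of its mass at u = 0 and half at u = 1.
spectrum = [Fraction(0), Fraction(1)]
weights = [Fraction(1, 2), Fraction(1, 2)]
r = moment_vector(spectrum, weights, 2)
print(r)
```

Exact rational arithmetic is used so that the convexity check on the weights is not disturbed by rounding.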

Finding a convex representation of $r^\circ$ and then inverting the functions $r(u)$ may be very difficult for some problems, and it is then necessary to try a more direct approach. For example, one might attempt to find the spectrum $\{u_i\}$ and weightings $\{\alpha_i\}$ as the solution of a programming problem of the type

$$\min_{u_i,\ \alpha_i,\ i = 1, 2, \ldots, \mu+1} \ \Big\| \, r^\circ - \sum_{i=1}^{\mu+1} \alpha_i\, r(u_i) \Big\|$$

subject to constraints

$$\begin{aligned} \alpha_i &\ge 0, & i &= 1, 2, \ldots, \mu+1 \\ \sum_{i=1}^{\mu+1} \alpha_i &= 1 \\ u_i &\in U, & i &= 1, 2, \ldots, \mu+1 \quad (\text{i.e., } u_{ij} \in [0, 1],\ j = 1, 2, \ldots, m) \end{aligned} \tag{4.72}$$

where the minimum distance is of course zero.

If the functions $r(u)$ can be inverted, the general approach may be appropriate. The critical part of that approach is to find the spectrum $\{u_i\}$ or the equivalent points $r_i \in C_R$. The weights $\alpha_i$ are relatively easy to generate since they appear linearly and must be a solution of

$$\sum_{i=1}^{k} \alpha_i\, r_i = r^\circ$$

or

$$[\, r_1 \ \ r_2 \ \ \cdots \ \ r_k \,]\ \alpha = r^\circ \tag{4.73}$$
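
Once a spectrum has been chosen, the linear system for the weights can be solved directly. The sketch below is a minimal illustration under stated assumptions: a scalar control with $r_k(u) = u^k$, distinct spectrum points, and a hypothetical helper `solve_weights` that uses only the first $k$ moment equations (the remaining components of $r^\circ$ should then be checked for consistency).

```python
from fractions import Fraction

def solve_weights(spectrum, r0):
    """Solve sum_i alpha_i * u_i**k = r0[k], k = 0..len(spectrum)-1, for alpha."""
    n = len(spectrum)
    # Augmented matrix [V | r0]; row k holds the k-th powers of the spectrum points.
    aug = [[Fraction(u) ** k for u in spectrum] + [Fraction(r0[k])]
           for k in range(n)]
    for col in range(n):
        # Gauss-Jordan elimination with a simple nonzero-pivot search.
        pivot = next(row for row in range(col, n) if aug[row][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]

r_target = [1, Fraction(1, 2), Fraction(1, 2)]   # hypothetical moment vector
alpha = solve_weights([0, 1], r_target)
# Consistency check on the moment not used in the solve (k = 2).
residual = sum(a * Fraction(u) ** 2 for a, u in zip(alpha, [0, 1])) - r_target[2]
print(alpha, residual)
```

Because the zeroth component of every moment vector is one, the first equation of the system is exactly the requirement that the weights sum to one.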


For the special case of scalar controls and polynomial payoffs, Karlin and Shapley [41] show that when

$$r_i(u) = u^i, \qquad i = 1, 2, \ldots, \mu$$

and a point $r^\circ \in R$ is given, the spectrum of $r^\circ$ is given by the roots of the polynomial functions generated by determinants of the type (for $\mu$ even and $r^\circ$ belonging to the lower surface of $R$)

$$\begin{vmatrix} 1 & u & \cdots & u^{m} \\ r_0^\circ & r_1^\circ & \cdots & r_m^\circ \\ r_1^\circ & r_2^\circ & \cdots & r_{m+1}^\circ \\ \vdots & \vdots & & \vdots \\ r_{m-1}^\circ & r_m^\circ & \cdots & r_{2m-1}^\circ \end{vmatrix} = 0 \tag{4.74}$$

where $2m = \mu$. They also derive other cases. Their results are easily extended to multidimensional uncoupled controls.

Another way to compute the spectrum can be used when the support hyperplanes of $R$ are known. In our discussion we assume that $r^\circ$ belongs to the boundary of $R$, which is for our purposes completely general because the compactness of $R$ implies that any $r \in R$ may be represented in terms of a convex sum of two boundary points of $R$. For any $r^\circ$ belonging to the boundary of $R$, we know that there is at least one support hyperplane to $R$ which contains $r^\circ$. Let $h^\circ$ represent such a hyperplane, so that, since $r_0 = 1$ by assumption,

$$h^{\circ T} r^\circ = 0, \qquad h^{\circ T} r \ge 0 \quad \text{for all } r \in R \tag{4.75}$$

As already established in Section 4.4 for a less general case, for $u_i$ to belong to the spectrum corresponding to $r^\circ$, it is necessary that

$$h^{\circ T} r(u_i) = 0 \tag{4.76}$$

Therefore, we may seek the spectrum among the points $r_i \in C_R$ for which $h^{\circ T} r_i = 0$ and find $u_i$ as the inverse of $r_i$.

An important property of this hyperplane technique is that it is a useful companion to the method of linear programming used to solve the dual cone problem. The solution $r^\circ$ of problem (4.68) of necessity lies on at least one face of $\tilde{R}$, that is, at least one of the inequalities

$$\tilde{r}_i^T r^\circ \ge 0, \qquad i = 1, 2, \ldots, N_R$$

will in fact be an equality for some index $j$. But $\tilde{r}_j$ represents a support hyperplane of $\tilde{R}$. A catalog of the points in $C_R$ which generate each hyperplane will immediately reveal which such points generate $\tilde{r}_j$ and, by implication, which points belong to a spectrum for $r^\circ$.
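
The search for contact points can be sketched in a few lines of Python. The example is hypothetical: it assumes a scalar control with $r(u) = (1, u, u^2)$, a grid of candidate controls, and an active face $r_1 - r_2 = 0$ written as the normal `h_active`; none of these names or values come from the text.

```python
def contact_points(h, u_grid, tol=1e-9):
    """Return the grid points u whose moment vector r(u) = (1, u, u^2)
    lies on the hyperplane h^T r = 0."""
    hits = []
    for u in u_grid:
        r = (1.0, u, u * u)
        if abs(sum(hk * rk for hk, rk in zip(h, r))) <= tol:
            hits.append(u)
    return hits

h_active = (0.0, 1.0, -1.0)            # assumed active face: r1 - r2 = 0
u_grid = [k / 10 for k in range(11)]   # candidate controls in [0, 1]
spectrum = contact_points(h_active, u_grid)
print(spectrum)
```

In practice the catalog mentioned above would be stored when the hyperplanes are generated, so that no search over a grid is needed; the scan here only stands in for that lookup.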

4.7 SUMMARY

At this point we take stock of our accomplishments in this chapter. For the static game problem with payoff

$$J(u, v) = r^T(u)\, A\, s(v) \tag{4.1}$$

where $u$ and $v$ belong to unit hypercubes, we have demonstrated, using the method of dual cones, both a solution technique and an interesting geometrical interpretation of the interactions of the control spaces. Of particular importance are the facts that the game problem was shown to be solvable as a constrained maximization problem and that approximate numerical solutions are possible using linear programming, for which well-developed computer programs already exist. We also characterized some of the sets involved in special cases and indicated how the optimal c.d.f.'s may be found.

These facts are the foundation for the consideration in Chapter 5 of multistage games.


CHAPTER  5 


THE  SOLUTIONS  OF  A  CLASS  OF  MULTISTAGE  GAMES 

In this chapter the problem of finding a saddlepoint for the expected value of the cost function $J$ of two-person zero-sum $N$-stage games of perfect information is discussed. For the games of interest, the cost function has the form

$$J = g_{N+1}(z(N+1)) + \sum_{i=1}^{N} g_i(z(i), u(i), v(i)), \tag{3.3}$$

the dynamics have the form

$$z(i+1) = f(z(i), u(i), v(i); i), \qquad z(1) = z, \tag{3.1}$$

and the controls $u(i)$ and $v(i)$ are to be chosen at each stage from $m$- and $n$-dimensional unit hypercubes $U$ and $V$, respectively. The functions $g_i$ and $f$ are polynomials.

Two variations of this dynamic game, that of open-loop strategies and that of closed-loop strategies, are analyzed using the methods of Chapter 4. This is done by first showing that each of those games can be reduced to certain static games in which the state vector $z$ is a parameter. Then continuity properties of the optimal solutions are established, and finally stronger results for the class of linear-quadratic games are derived. As indicated in Chapter 2, existence of the saddlepoint optimum was established by earlier researchers, who will be cited as appropriate in the next two sections.


5.1 CLOSED-LOOP STRATEGIES AND THE PRINCIPLE OF OPTIMALITY

In Chapter 3 the multistage game with closed-loop strategies was defined. The closed-loop optimal mixed strategies $F^\circ(u(i) \mid z(i), i)$ and $G^\circ(v(i) \mid z(i), i)$ and the value function $w_i(z(i))$ were defined via equation (3.5). By simple substitution in that equation it is clear that the value satisfies the recursive equations

$$\begin{aligned} w_{N+1}(z(N+1)) &= g_{N+1}(z(N+1)) \\ w_i(z(i)) &= \int_V \int_U \big[ g_i(z(i), u, v) + w_{i+1}(f(z(i), u, v; i)) \big]\, dF^\circ(u \mid z(i), i)\, dG^\circ(v \mid z(i), i) \\ &= \operatorname*{val}_{(u, v)} \big[ g_i(z(i), u, v) + w_{i+1}(f(z(i), u, v; i)) \big] \end{aligned} \tag{5.1}$$

The fact that such a quantity exists and satisfies (5.1) has been used either explicitly or implicitly by many researchers. Fleming [53] states the necessary facts in a theorem which is directly applicable to the present problem.

Since $U$ and $V$ are hypercubes, the problem of solving (5.1) for each $i$ and for fixed $z(i)$ can be approached by the methods of Chapter 4 provided that the quantity to be optimized is separable in $u$ and $v$. This is the case when, by suitable grouping of terms, we may write (5.1) as

$$w_i(z) = \operatorname*{val}_{(u, v)} \Big[ \sum_{k=1}^{\mu} \sum_{j=1}^{\nu} r_k(u)\, a_{kj}(z)\, s_j(v) \Big] = \operatorname*{val}_{(u, v)} \big[ r^T(u)\, A(z)\, s(v) \big] \tag{5.2}$$

In the special case of polynomials, for example, the functions $r_k$, $s_j$, $a_{ij}$ have the following forms:

$$\begin{aligned} r_k(u) &= u_1^{\xi_{1k}} u_2^{\xi_{2k}} \cdots u_m^{\xi_{mk}}, & k &= 1, 2, \ldots, \mu; & \xi_{ik} &= \text{non-negative integer}, \ i = 1, \ldots, m \\ s_j(v) &= v_1^{\eta_{1j}} v_2^{\eta_{2j}} \cdots v_n^{\eta_{nj}}, & j &= 1, 2, \ldots, \nu; & \eta_{kj} &= \text{non-negative integer}, \ k = 1, \ldots, n \\ a_{ij}(z) &= c_{ij}\, z_1^{\zeta_{1ij}} \cdots z_\ell^{\zeta_{\ell ij}}, & i &= 1, 2, \ldots, \mu, \ j = 1, 2, \ldots, \nu; & \zeta_{kij} &= \text{non-negative integer}, \ k = 1, \ldots, \ell \end{aligned} \tag{5.3}$$

This form is analyzed in detail in later sections. Note that it is a parameterized version of the problem of Chapter 4.
The constraint that the right hand side functions in (5.1) be separable is important. The functions $g_i(z, u, v)$ are separable by definition, so it is the term $w_{i+1}(f(z, u, v; i))$ which is of concern. Unfortunately, as demonstrated in an example in Chapter 6, this term is not always separable. This is not surprising, for even simple optimization problems with parameters often have extrema which are not of the same form as the given problem. For example, the equation of the maximum in $t$ of the quadratic function

$$f(z, t) = a_0(z) + a_1(z)\, t + a_2(z)\, t^2, \qquad a_2(z) < 0$$

is

$$\max_t f(z, t) = a_0(z) - \frac{a_1^2(z)}{4\, a_2(z)}$$
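
For completeness, the closed form can be checked from the first-order condition, taking $t$ to be unconstrained as the example implicitly does:

```latex
\frac{\partial f}{\partial t} = a_1(z) + 2\,a_2(z)\,t = 0
\quad\Longrightarrow\quad
t^{*}(z) = -\frac{a_1(z)}{2\,a_2(z)},
\qquad
f\bigl(z, t^{*}(z)\bigr)
  = a_0(z) - \frac{a_1^{2}(z)}{2\,a_2(z)} + \frac{a_1^{2}(z)}{4\,a_2(z)}
  = a_0(z) - \frac{a_1^{2}(z)}{4\,a_2(z)} .
```

Even when $a_0$, $a_1$, and $a_2$ are polynomials in $z$, the maximum is in general a rational function of $z$ rather than a polynomial, which is the loss of form referred to above.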


Although the value function is not always such that $w_{i+1}(f(z, u, v; i))$ is separable, this term is separable for $i = N$ and for linear-quadratic problems; the latter fact is proven in Sections 5.4 and 5.5. In addition, it may be separable for other classes of problems and for special regions of problems for which general separability does not hold; this requires further research and detailed analysis of the functions. Finally, for numerical purposes it should be satisfactory to approximate $w_{i+1}(f(z, u, v; i))$ by a suitable separable function and to apply dynamic programming and the methods of Chapter 4 to the resulting problem.
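
The backward recursion (5.1) can be sketched in Python for a deliberately crude special case. This is not the method of Chapter 4: as an assumption made only to keep the example self-contained, each control is restricted to the two values 0 and 1, so that every stage game is a 2 x 2 matrix game whose value has a closed form. The functions `g`, `f`, and `g_terminal` and the two-stage example are hypothetical.

```python
from fractions import Fraction

def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                      # saddlepoint in pure strategies
        return maximin
    return (a * d - b * c) / (a + d - b - c)    # both players must mix

def stage_value(i, z, n_stages, g, f, g_terminal):
    """w_i(z) = val over (u, v) of g_i(z, u, v) + w_{i+1}(f(z, u, v; i))."""
    if i == n_stages + 1:
        return g_terminal(z)
    controls = (0, 1)                           # assumption: endpoint controls only
    m = [[g(i, z, u, v)
          + stage_value(i + 1, f(i, z, u, v), n_stages, g, f, g_terminal)
          for v in controls] for u in controls]
    return value_2x2(m)

# Hypothetical two-stage game: the maximizer scores 1 whenever the controls match.
g = lambda i, z, u, v: Fraction(1 if u == v else 0)
f = lambda i, z, u, v: z                        # trivial dynamics for the illustration
g_terminal = lambda z: z
w1 = stage_value(1, Fraction(0), 2, g, f, g_terminal)
print(w1)
```

Replacing `value_2x2` by a routine that solves the separable stage game for continuous controls would give the dynamic programming scheme described in the text.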

5.2 OPEN-LOOP STRATEGIES AND BATCH PROCESSING SOLUTIONS

In Chapter 3 the polynomial game with open-loop strategies was described. In this section we reduce a stage $i$ of that game for which $z(i)$ is known to an equivalent single-stage game in which $z(i)$ is a parameter and show that this truncated game may be solved as a batch process. The reduction is essentially algebraic, and the fact that the resulting form is identical to that used in Chapter 4 guarantees a saddlepoint solution.

Without loss of generality, but with a considerable gain in notational convenience, let us consider the problem for $i = 1$. By repeatedly substituting (3.1) into (3.3), we may demonstrate explicitly the independent variables in the cost function

$$\begin{aligned} J = {} & g_1(z(1), u(1), v(1)) + g_2(f(z(1), u(1), v(1); 1), u(2), v(2)) + \cdots \\ & + g_N(f(f(\ldots f(z(1), u(1), v(1); 1) \ldots), u(N-1), v(N-1); N-1), u(N), v(N)) \\ & + g_{N+1}(f(f(\ldots f(z(1), u(1), v(1); 1) \ldots), u(N), v(N); N)) \end{aligned} \tag{5.4}$$

Because all of the functions $g_i(\cdot, \cdot, \cdot)$ and $f(\cdot, \cdot, \cdot; i)$ are polynomials in their arguments for all applicable indices $i$, this may be rewritten as

$$J = g(z(1), u(1), \ldots, u(N), v(1), v(2), \ldots, v(N)) \tag{5.5}$$

where $g$ is a suitable polynomial function of the arguments. We may define an $mN$-vector $\bar{u}$ and an $nN$-vector $\bar{v}$

$$\bar{u} = \begin{bmatrix} u(1) \\ u(2) \\ \vdots \\ u(N) \end{bmatrix}, \qquad \bar{v} = \begin{bmatrix} v(1) \\ v(2) \\ \vdots \\ v(N) \end{bmatrix} \tag{5.6}$$

noting that these are elements of an $mN$-dimensional unit hypercube $\bar{U}$ and an $nN$-dimensional unit hypercube $\bar{V}$, respectively, and rewrite (5.5) as

$$J = g(z(1), \bar{u}, \bar{v}) \tag{5.7}$$

where we have simply changed notation and $g$ is still a polynomial function of the elements of the various vectors. A typical term of (5.7) has the form

$$c\, z_1^{\zeta_1} z_2^{\zeta_2} \cdots z_\ell^{\zeta_\ell}\ \bar{u}_1^{\xi_1} \bar{u}_2^{\xi_2} \cdots \bar{u}_{Nm}^{\xi_{Nm}}\ \bar{v}_1^{\eta_1} \bar{v}_2^{\eta_2} \cdots \bar{v}_{Nn}^{\eta_{Nn}} \tag{5.8}$$

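where $c$ is a constant, $z_j$ is the $j$th element of $z(1)$, $\bar{u}_j$ is the $j$th element of $\bar{u}$, etc., and all exponents are non-negative finite integers. Define

$$\begin{aligned} r_0(\bar{u}) &\triangleq 1 \\ r_i(\bar{u}) &= \bar{u}_1^{\xi_1} \bar{u}_2^{\xi_2} \cdots \bar{u}_{Nm}^{\xi_{Nm}} \\ s_j(\bar{v}) &= \bar{v}_1^{\eta_1} \bar{v}_2^{\eta_2} \cdots \bar{v}_{Nn}^{\eta_{Nn}} \\ a_{ij}(z) &= c\, z_1^{\zeta_1} z_2^{\zeta_2} \cdots z_\ell^{\zeta_\ell} \end{aligned}$$

where it is implicit that the constant $c$ and exponents $\zeta$ depend upon the indices $i$ and $j$, that the exponents $\xi$ depend upon $i$, and that the exponents $\eta$ depend upon $j$. Then we may for suitable finite integers $\mu$ and $\nu$ rewrite (5.7) as

$$J = \sum_{j=0}^{\nu} \sum_{i=0}^{\mu} a_{ij}(z)\, r_i(\bar{u})\, s_j(\bar{v}) = r^T(\bar{u})\, A(z)\, s(\bar{v}) \tag{5.9}$$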


In the vector-matrix representation, $r$ and $s$ are the obvious $(\mu+1)$- and $(\nu+1)$-dimensional vector functions and $A$ is a $(\mu+1) \times (\nu+1)$ matrix function. For a given initial condition $z$, (5.9) is precisely the problem which was solved in Chapter 4. It is noteworthy that it is not necessary that the payoff and dynamics functions be polynomials for (5.9) to result from the substitutions of (3.1) into (3.3), although the class of polynomials is perhaps of widest interest to us. Certainly if the functions are separable in $z$, $u$, and $v$ and polynomial in $z$, the payoff can be written in the separable form (5.9) and solved by the method of dual cones. Many special problems may also have this characteristic.
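
The grouping of terms that leads from a polynomial payoff to the separable form can be illustrated with a small Python sketch. The representation is an assumption made for the example: a polynomial is stored as a dictionary mapping exponent tuples (for $z$, $u$, $v$) to coefficients, and the helper names `separable_form`, `monomial`, and `evaluate` are hypothetical.

```python
def separable_form(terms):
    """terms maps (zeta, xi, eta) exponent tuples to coefficients c.
    Returns (xis, etas, a_terms), where a_terms[(i, j)] lists the
    (c, zeta) pairs that make up the polynomial a_ij(z)."""
    xis = sorted({xi for _, xi, _ in terms})      # distinct monomials in u
    etas = sorted({eta for _, _, eta in terms})   # distinct monomials in v
    a_terms = {}
    for (zeta, xi, eta), c in terms.items():
        key = (xis.index(xi), etas.index(eta))
        a_terms.setdefault(key, []).append((c, zeta))
    return xis, etas, a_terms

def monomial(x, exps):
    """Evaluate x_1**e_1 * x_2**e_2 * ... for the given exponents."""
    out = 1
    for xk, ek in zip(x, exps):
        out *= xk ** ek
    return out

def evaluate(xis, etas, a_terms, z, u, v):
    """Evaluate r^T(u) A(z) s(v) from the grouped representation."""
    r = [monomial(u, xi) for xi in xis]
    s = [monomial(v, eta) for eta in etas]
    total = 0
    for (i, j), pieces in a_terms.items():
        a_ij = sum(c * monomial(z, zeta) for c, zeta in pieces)
        total += r[i] * a_ij * s[j]
    return total

# Hypothetical payoff J = z*u*v - u**2 + 2*v + 3*z**2 with scalar z, u, v.
terms = {((1,), (1,), (1,)): 1, ((0,), (2,), (0,)): -1,
         ((0,), (0,), (1,)): 2, ((2,), (0,), (0,)): 3}
xis, etas, a_terms = separable_form(terms)
J = evaluate(xis, etas, a_terms, z=(2,), u=(3,), v=(5,))
print(J)
```

Sorting the exponent tuples places the all-zero exponents first, so the leading components of $r$ and $s$ are the constant 1 whenever constant factors are present, in keeping with the convention $r_0 = 1$, $s_0 = 1$.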

That (5.9) is equivalent to (3.3) and is solvable by the methods of Chapter 4 is easily shown. The solution of (5.9) is a value $w$ and a pair of mixed strategies $F^\circ(\bar{u} \mid z, 1)$ and $G^\circ(\bar{v} \mid z, 1)$. These are equivalent to the value $w_1(z)$ and the set of strategies $F_i^\circ(u(i) \mid z, 1; u(1), \ldots, u(i-1))$ and $G_i^\circ(v(i) \mid z, 1; v(1), \ldots, v(i-1))$, as can be seen by substituting (5.9) into (3.6), changing the order of integration, and grouping terms to get

$$\begin{aligned} w_1(z) = {} & \Big[ \int_U \cdots \int_U r^T(\bar{u})\, dF_N^\circ(u(N) \mid z, 1; u(1), \ldots, u(N-1)) \cdots dF_1^\circ(u(1) \mid z, 1) \Big] \\ & \cdot A(z) \Big[ \int_V \cdots \int_V s(\bar{v})\, dG_N^\circ(v(N) \mid z, 1; v(1), \ldots, v(N-1)) \cdots dG_1^\circ(v(1) \mid z, 1) \Big] \\ = {} & \Big[ \int_{\bar{U}} r^T(\bar{u})\, dF^\circ(\bar{u} \mid z, 1) \Big] A(z) \Big[ \int_{\bar{V}} s(\bar{v})\, dG^\circ(\bar{v} \mid z, 1) \Big] \end{aligned}$$

Here $\bar{U} = U \times U \times \cdots \times U$ and $\bar{V} = V \times V \times \cdots \times V$. Integrability is no problem since we may restrict the c.d.f.'s used to those with finite spectra if necessary. Hence, solving (5.9) for mixed strategies on $\bar{u}$ and $\bar{v}$ is equivalent to solving the open-loop strategy problem, and the former may be done using the method of dual cones.

The mixed strategies $F^\circ(\bar{u} \mid z, 1)$ and $G^\circ(\bar{v} \mid z, 1)$ have spectra consisting of control histories $\bar{u}$ and $\bar{v}$. If it is necessary to find the optimal mixtures of controls for stage $i$, the usual steps of integrating over all admissible controls for the other stages must be performed, a procedure which is reduced to summations because the spectra are finite. During play of a game, when only a realization of the control strategies is needed, this step may be bypassed by choosing a control history $\bar{u}$ (or $\bar{v}$) in a random manner and then picking out the desired elements $u(i)$ (or $v(i)$).
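
Both operations described here, marginalizing by summation and drawing a realization during play, can be sketched briefly in Python. The example assumes scalar controls at each stage and a hypothetical three-point spectrum of two-stage control histories; the function names are illustrative only.

```python
import random
from collections import defaultdict
from fractions import Fraction

def stage_marginal(histories, weights, i):
    """Marginal distribution of the stage-i control (1-based) by summation."""
    marginal = defaultdict(Fraction)
    for history, w in zip(histories, weights):
        marginal[history[i - 1]] += w
    return dict(marginal)

def sample_control(histories, weights, i, rng=random):
    """Draw one control history at random and return its stage-i element."""
    history = rng.choices(histories, weights=[float(w) for w in weights], k=1)[0]
    return history[i - 1]

# Hypothetical finite spectrum of two-stage control histories (u(1), u(2)).
histories = [(0, 1), (1, 1), (1, 0)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
m1 = stage_marginal(histories, weights, 1)
m2 = stage_marginal(histories, weights, 2)
u1 = sample_control(histories, weights, 1, random.Random(0))
print(m1, m2, u1)
```

Sampling a whole history and reading off one element avoids computing the marginals at all, which is the shortcut noted in the text.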

The discussion above applies in a natural manner if the game is assumed to start at stage $i$ with initial condition $z(i) = z$. Therefore each player will, at any stage for which both obtain new state information $z$, use the methods of Chapter 4 and the initial condition $z$ to generate his remaining control histories and then select his control for the present stage using a random choice among those histories.

If both players have optimal pure strategies, then the batch processing method may also be used for the game with closed-loop strategies. This fact is discussed in an enlightening manner by Ho [36]. It does not hold when mixed strategies are used, however, as the reader may demonstrate using simple counterexamples. Example 6.1 is a good one on which to base a counterexample.

5.3 CONTINUITY PROPERTIES OF THE SOLUTIONS OF SEPARABLE GAMES

The exact nature of the dependence of the solutions of multistage games on the initial conditions $z$ varies with the structure of the game and must be established on a case by case basis. One particular property, namely continuity, can be shown to hold in fairly general circumstances. We shall discuss continuity conditions for the value function and for the strategies in the present section before moving on to establish sharper results in later portions of this chapter.

Using our previous results and the notation established in Sections 5.1 and 5.2, we know for some polynomial games with closed-loop strategies and all with open-loop strategies that the value function $w(z)$ satisfies, for given $z$,

$$w(z) = \min_{s \in S} \max_{r \in R}\ r^T A(z)\, s = \max_{r \in R} \min_{s \in S}\ r^T A(z)\, s \tag{5.10}$$

where $R$ and $S$ are convex hulls of continuous mappings of compact sets $U$ and $V$ or $\bar{U}$ and $\bar{V}$, respectively. This representation will prove useful in much of the discussion to follow.

The following well-known result is essential to understanding the interactions of the dual cones when the matrix $A$ is parameterized.

Theorem 5.1: If the elements $a_{ij}(z)$ of the matrix $A(z)$ are continuous in $z$ and if $R$ and $S$ are compact, then $w(z) = \max_{r \in R} \min_{s \in S} r^T A(z)\, s$ is continuous in $z$.

Proof: Let $z_0$ be an arbitrary value of the parameter $z$ and let $D_\epsilon(z_0)$ denote the set such that, for given $\epsilon > 0$,

$$|a_{ij}(z) - a_{ij}(z_0)| < \epsilon, \qquad \text{all } i, j$$

for all $z \in D_\epsilon(z_0)$. Such a set exists since the elements of $A$ are continuous. Then if $r^\circ(z)$ and $s^\circ(z)$ are optimal moment vectors at $z$,

$$\begin{aligned} w(z) - w(z_0) &= r^{\circ T}(z)\, A(z)\, s^\circ(z) - r^{\circ T}(z_0)\, A(z_0)\, s^\circ(z_0) \\ &\le r^{\circ T}(z)\, [A(z) - A(z_0)]\, s^\circ(z_0) \\ &\le \epsilon \sum_{i, j} |r_i^\circ(z)\, s_j^\circ(z_0)| \end{aligned} \tag{5.12}$$

and

$$\begin{aligned} w(z) - w(z_0) &\ge r^{\circ T}(z_0)\, [A(z) - A(z_0)]\, s^\circ(z) \\ &\ge -\epsilon \sum_{i, j} |r_i^\circ(z_0)\, s_j^\circ(z)| \end{aligned} \tag{5.13}$$

which, since $R$ and $S$ are compact, implies $|w(z) - w(z_0)| \le k \epsilon$ for some finite $k$.

Another  well-known  fact  is  that  the  limit  of  the  optimal 
strategies  of  a  sequence  of  games  is  an  optimal  strategy  for  the 
limit  of  the  games.  This  is  useful  when  payoff  functions  must  be 
approximated,  as  we  shall  see  in  Chapter  6,  and  for  proving  results 
about  continuity  of  optimal  strategies.  For  reference  we  formalize 
this  fact  in  the  following  lemma  and  indicate  a  brief  proof. 


Lemma B: If $r_n$, $s_n$ are optimal strategies for the game $r^T A_{\epsilon_n} s$, where $A_{\epsilon_n}$ is element-by-element within $\epsilon_n$ of the matrix $A$,

$$|a_{ij}^{\epsilon_n} - a_{ij}| < \epsilon_n$$

and where $r_n$ and $s_n$ must be chosen from compact sets $R$ and $S$, respectively, then there exist limits $r^\circ$ and $s^\circ$ of the sequences $\{r_n\}$ and $\{s_n\}$, $\epsilon_n \to 0$, which are optimal strategies for the game with matrix $A$.

Proof: We indicate the proof for $r^\circ$; analogous results hold for $s^\circ$. The existence of limits follows immediately from the fact that $\{r_n\}$ is an infinite sequence in a compact set $R$ and must therefore have a convergent subsequence with limit point in $R$. Call this limit point $r^\circ$. Then $r^\circ$ is an optimal strategy for Player I for the game with matrix $A$, for if it were not, then

$$w = \min_{s \in S} \max_{r \in R}\ r^T A s > \min_{s \in S}\ r^{\circ T} A s, \qquad \Big| w - \min_{s \in S}\ r^{\circ T} A s \Big| \ge \delta > 0 \tag{5.14}$$

for some $\delta$. But if we define

$$w_n = \min_{s \in S} \max_{r \in R}\ r^T A_{\epsilon_n} s \tag{5.15}$$

then, by the triangle inequality,

$$\Big| w - \min_{s \in S} r^{\circ T} A s \Big| \le |w - w_n| + \Big| w_n - \min_{s \in S} r_n^T A s \Big| + \Big| \min_{s \in S} r_n^T A s - \min_{s \in S} r^{\circ T} A s \Big| \tag{5.16}$$

The first term on the right can be made less than $\epsilon/3$ for large enough $n > N_1$ by the arguments used in Theorem 5.1, which used only the closeness of the terms of the matrices $A(z)$ and $A(z_0)$. Similarly the second term can be made less than $\epsilon/3$ for $n > N_2$ by arguments using closeness of the matrices and boundedness of $S$, and the third term can be made less than $\epsilon/3$ for $n > N_3$ using the facts that $r_n \to r^\circ$ and that $S$ is compact. Thus

$$\Big| w - \min_{s \in S} r^{\circ T} A s \Big| \le \epsilon \tag{5.17}$$

for arbitrary $\epsilon > 0$, contradicting (5.14).


In discussing continuity of moment sets and c.d.f.'s as functions of $z$, the following version of the definition of upper semicontinuous mappings is useful.

Definition 5.1: A point-to-set mapping $\phi(x)$ is called upper semicontinuous at $x_0$ if corresponding to any open set $\Phi$ containing $\phi(x_0)$ there exists some $\delta > 0$ such that $d(x, x_0) < \delta$ implies $\phi(x) \subset \Phi$, where $d(\cdot, \cdot)$ is a metric defined on the domain of $\phi$.

Using this definition, we adapt a theorem of Karlin [40] to our interests.

Theorem 5.2: The set $R^\circ(z)$ of optimal strategies for the game defined by $r^T A(z)\, s$, $r \in R$, $s \in S$, is an upper semicontinuous function of the parameter $z$.

Proof: Let $z_0$ be an arbitrary value of the parameter $z$ and let $R^\circ(z_0)$ be the optimal moments for the game with initial condition $z_0$. Suppose $H$ is an arbitrary open set such that $R^\circ(z_0) \subset H$. Let $D_\epsilon(z_0)$ be as in the proof of Theorem 5.1 and let $R_\epsilon$ be the set of all moments $r \in R$ which are optimal for at least one $z \in D_\epsilon(z_0)$. We must show that for $\epsilon > 0$ sufficiently small, $R_\epsilon \subset H$.

Suppose the contrary. Then there exists a sequence $\{\epsilon_n\}$, $\epsilon_n \to 0$, such that $R_{\epsilon_n} \not\subset H$ for all $n$. For each $n$, choose $z_n \in D_{\epsilon_n}(z_0)$ with corresponding optimal strategy $r_n$ such that $r_n \notin H$. Then we have a sequence $\{r_n\}$ in a compact set $R$ such that $r_n \notin H$. Thus the sequence will have a convergent subsequence with some limit point $r^\circ \in R$, $r^\circ \notin H$. But by Lemma B, $r^\circ$ is an optimal moment vector for the game $r^T A(z_0)\, s$, and therefore $r^\circ \in R^\circ(z_0) \subset H$, a contradiction which completes the proof.

At this point we go beyond previous work to establish a form of continuity for the optimal cumulative distribution functions $F^\circ(u \mid z)$ and $G^\circ(v \mid z)$. Some of the pitfalls are recognizable in advance and must be carefully circumvented. In particular, we must remember that (1) the optimal c.d.f.'s are not necessarily unique, and (2) the c.d.f.'s are discrete over the sets $U$ and $V$, and hence continuity in $z$ is much like the continuity of the zeros of a polynomial as functions of the coefficients.

We shall develop our theory using the support hyperplanes to $R$ at its boundary points. We remember that by assumption $r_0 = 1$ for $r \in R$, and without loss of generality we assume that bounded normals of hyperplanes have length less than or equal to unity.


Theorem 5.3: The set $H(\hat{r})$ of the bounded representations (i.e., normals) of the support hyperplanes to $\hat{R}$ at $\hat{r}$ is an upper semicontinuous function of the boundary points of $\hat{R}$.

Proof: Let $\hat{r}^\circ$ belong to the boundary of $\hat{R}$, let $H(\hat{r}^\circ)$ be the set of all $h$ such that $h^T \hat{r}^\circ = 0$, $h^T \hat{r} \ge 0$ for all $\hat{r} \in \hat{R}$, and $\|h\| \le 1$, where $\hat{r}^\circ = [1, r^{\circ T}]^T$, and let $\mathcal{H}$ be an open set containing $H(\hat{r}^\circ)$. We assume that the contrary of the theorem holds and that $D_\epsilon$ is the open set of all $\hat{r}$ in the boundary of $\hat{R}$ such that $\|\hat{r} - \hat{r}^\circ\| < \epsilon$. Then if $\{\epsilon_n\}$ is a real sequence, $\epsilon_n > 0$, $\epsilon_n \to 0$, we have that $\hat{r}_n \in D_{\epsilon_n}$ has limit point $\hat{r}^\circ$. Furthermore, if $H_n$ is the set of all $h$ which support $\hat{R}$ at at least one point of $D_{\epsilon_n}$, we have $H_n \not\subset \mathcal{H}$ as our contrary assumption. The set of all hyperplanes with normals of unity or less is necessarily compact for the compact convex set $\hat{R}$ and in fact is a portion of the dual cone $P_R$. Choose from each $H_n$ a vector $h_n \notin \mathcal{H}$. Then the sequence $\{h_n\}$ has a limit point, call it $h^\circ$, such that $h^\circ \notin \mathcal{H}$. But each $h_n$ supports $\hat{R}$, and thus $h^\circ$ supports $\hat{R}$. Thus we must have

$$h^{\circ T} \hat{r}^\circ \ge \delta > 0$$

Since $h_n^T \hat{r}_n = 0$ for some $\hat{r}_n \in D_{\epsilon_n}$ for each $n$, we have

$$| h^{\circ T} \hat{r}^\circ - h_n^T \hat{r}_n | \ge \delta > 0$$

But

$$| h^{\circ T} \hat{r}^\circ - h_n^T \hat{r}_n | = | h^{\circ T} (\hat{r}^\circ - \hat{r}_n) - (h_n - h^\circ)^T \hat{r}_n | \le \|h^\circ\|\, \|\hat{r}^\circ - \hat{r}_n\| + \|h_n - h^\circ\|\, \|\hat{r}_n\| \tag{5.18}$$

which can be made arbitrarily small because $\hat{r}_n \to \hat{r}^\circ$ and $h_n \to h^\circ$, a contradiction which completes our proof.

Corollary 5.3-1: The set $H^*(z)$ of the bounded representations of the support hyperplanes to $\hat{R}$ at the optimal strategies $R^\circ(z)$ of the game with initial condition $z$ is an upper semicontinuous function of $z$ provided $R^\circ(z)$ consists of boundary points of $\hat{R}$.

Proof: This follows immediately from Theorem 5.3 by using Theorem 5.2 and the definition of $H^*(z)$.

Our next theorem leads to a characterization of the continuity of the spectrum of $F^\circ(u|z)$.

Theorem 5.4: The set $\varphi(h)$ of all contact points of the support hyperplane to $\hat{R}$ represented by $h$ with the set $C_R$ is an upper semicontinuous function of $h$.

Proof: We remember that $\hat{R}$ is the convex hull of $C_R$. Also we remark that $\varphi$ may or may not be connected. We proceed much as in the proofs above, taking a sequence $h_n$ of normals to support hyperplanes to $\hat{R}$ and looking at their sets $\varphi_n$ of contact points with $C_R$. If $h^\circ$ is the limit of $h_n$ but no $\varphi_n$ is contained in a given open set which contains $\varphi(h^\circ)$, then there must be a sequence of points $r_n \in C_R$, $r_n \notin \varphi(h^\circ)$, such that $r_n \to r^\circ \notin \varphi(h^\circ)$. But $\varphi(h^\circ)$ is the set

$$ \varphi(h^\circ) = \{ r \mid r \in C_R, \; h^{\circ T} r = 0 \} $$

and thus, since $h^\circ$ supports $\hat{R}$, we must have

$$ h^{\circ T} r^\circ \ge \delta > 0 $$

for some $\delta$. This situation is similar to that of Theorem 5.3 and in particular to equation (5.18), and a similar contradiction arises, completing the proof.

Corollary 5.4-1: The set $\varphi(r)$ of all contact points of all the support hyperplanes to $\hat{R}$ at $r$ with the set $C_R$ is an upper semicontinuous function of $r$.

Corollary 5.4-2: The set $\varphi(z)$ of all contact points of all support hyperplanes to $\hat{R}$ at points $r \in R^\circ(z)$ with the set $C_R$ is an upper semicontinuous function of $z$, provided that $R^\circ(z)$ consists only of boundary points of $\hat{R}$.

We remark that Hurwitz's theorem gives a version of these results for the special case of zeros of polynomials as functions of their coefficients. For the game problem this is similar to a case with polynomial functions and scalar controls. Note that the corollaries to Theorem 5.4 require that all support hyperplanes of the given class be considered.

There is one more step before establishing our final result. We remember that Lemma A implies that for $r \in R$ it is possible to form a finite convex representation for $r$ in terms of elements $r_i \in C_R$, where $R$ is the convex hull of $C_R$. We may write such a representation as

$$ r = \sum_{i=1}^{M+1} \alpha_i r_i, \qquad \alpha_i \ge 0, \quad r_i \in C_R, \quad i = 1, 2, \ldots, M+1, \qquad \sum_{i=1}^{M+1} \alpha_i = 1 $$

We are interested in establishing continuity for the convex coefficients $\alpha_i$. Each coefficient is a function of the vector $r$ being represented, of the spectrum $r_1, r_2, \ldots$ used, and of the index $i$. Thus when the representation of $r$ is not unique or when a set of vectors $r$ is to be represented, one becomes concerned with an infinite set of such functions $\alpha_i$. Fortunately, our purposes are served by a more modest theorem than one concerning continuity of this set.


Theorem 5.5: If a sequence $\{r(n)\}$ has limit $r^\circ$, and if each $r(n)$ has convex representation $\sum_{i=1}^{M+1} \alpha_i(n)\, r_i(n)$, then $r^\circ$ has representation $\sum_{i=1}^{M+1} \alpha_i^\circ r_i^\circ$ where $r_i(n) \to r_i^\circ$ and $\alpha_i(n) \to \alpha_i^\circ$ for each $i$.

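Proof: Since each $\alpha_i(n) \in [0, 1]$ and each $r_i(n) \in C_R$, both of which are compact sets, each sequence has a convergent subsequence. (We assume implicitly that the elements are kept ordered, so that the limits will be independent.) Denote the limits by $\alpha_i^\circ$ and $r_i^\circ$. We are to show that $r^\circ = \sum_i \alpha_i^\circ r_i^\circ$. Suppose the contrary. Then

$$ \left\| r^\circ - \sum_{i=1}^{M+1} \alpha_i^\circ r_i^\circ \right\| \ge \delta > 0 \tag{5.19} $$

But

$$ \begin{aligned} \left\| r^\circ - \sum_{i=1}^{M+1} \alpha_i^\circ r_i^\circ \right\| &= \left\| r^\circ - r(n) + \sum_{i=1}^{M+1} \big( \alpha_i(n)\, r_i(n) - \alpha_i^\circ r_i^\circ \big) \right\| \\ &\le \| r^\circ - r(n) \| + \sum_{i=1}^{M+1} \big( \alpha_i(n)\, \| r_i(n) - r_i^\circ \| + \| r_i^\circ \| \, | \alpha_i(n) - \alpha_i^\circ | \big) \\ &< \epsilon_1 + \sum_{i=1}^{M+1} \big( \alpha_i(n)\, \epsilon_2 + \| r_i^\circ \| \, \epsilon_3 \big) \end{aligned} $$

for sufficiently large $n$ and arbitrary positive $\epsilon_1$, $\epsilon_2$, $\epsilon_3$. Since $\alpha_i(n)$ and $\|r_i^\circ\|$ are bounded, this contradicts (5.19) and completes our proof.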

Using this theorem, we are able to develop a statement of a form of continuity for the c.d.f.'s in terms of the initial condition $z$ of the state vector. To do this, we extend the concept of spectrum of a c.d.f. slightly by defining generalized spectrum sets.

Let $R^\circ(z)$ be the set of optimal moments for the maximizer for the game starting at $z$. Then an element $u$ of $U$ is said to belong to the generalized spectrum at $z$ if there exists a convex representation of some $r^\circ \in R^\circ(z)$ in terms of boundary points of $\hat{R}$ such that at least one support hyperplane to $\hat{R}$ at one of these boundary points contains a point $r \in C_R$ which is the image of $u$ under the transformations $r(u)$. From the discussion of Section 4.6 relating c.d.f.'s to moment vectors, it follows that the spectrum of any optimal c.d.f. for the maximizer at $z$ is contained in the generalized spectrum. The generalized spectrum thus contains all vectors $u$ which might belong to a spectrum of an optimal c.d.f. at $z$. A generalized spectrum for the minimizer may be defined analogously.

Using the definition of generalized spectrum and the results of Corollary 5.4-2 and Theorem 5.5, it is little more than a restatement of those results to obtain the following important theorem.

Theorem 5.6: The generalized spectrum for each player is an upper semicontinuous function of $z$. For given spectrum elements in this set, the corresponding weightings are also upper semicontinuous in $z$.

Loosely put, the implications of Theorem 5.6 are that if $z \to z_0$, then in an upper semicontinuous sense the spectrum points $u_i(z)$ and weightings $\alpha_i(z)$ of an optimal c.d.f. $F^\circ(u|z)$ tend to those of an optimal c.d.f. $F^\circ(u|z_0)$.

The upper semicontinuity is required primarily because of lack of uniqueness of solutions. The use of generalized spectra allows for the case in which $\alpha_i(z) \to 0$ as $z \to z_0$, since our definition of spectrum would not then consider $u_i(z_0)$ as a spectrum point of $F^\circ(u|z_0)$.

These  concepts  of  continuity  are  important  in  understanding 
the  effects  of  parameterization  of  the  solutions  introduced  by  con¬ 
sidering  dynamic  games.  The  continuity  of  the  value  function  and 
upper  semicontinuity  of  the  optimal  moment  sets  are  particularly 
useful  in  visualizing  those  effects  and  in  problem  solving. 

5.4 A LINEAR QUADRATIC GAME WITH SCALAR CONTROLS

In this section it is demonstrated that the value function for one special class of games is piecewise polynomial and therefore that the principle of optimality may be applied along with the parameterized method of dual cones in order to find a solution. In the course of the demonstration, the use of Theorem 4.5 is shown, the solution to the problem is generated, and the ideas to be used in the more general problem of the next section are illustrated.

Let $z(i)$ be an $\ell$-vector with stage index $i$ which satisfies

$$ z(i+1) = T_i z(i) + \alpha_i u(i) + \beta_i v(i) + \gamma_i \tag{5.21} $$

where $T_i$ is an $\ell \times \ell$ matrix, $\alpha_i$ and $\beta_i$ are $\ell$-vectors, $u(i)$ and $v(i)$ are scalars to be chosen from the unit interval $[0, 1]$, and $\gamma_i$ is an $\ell$-vector. Player I is to choose mixed strategies $F_i^\circ(u|z)$ to maximize the minimum expected value of the payoff function

$$ \begin{aligned} J = {} & z^T(N+1)\, S_{N+1}\, z(N+1) + e_{N+1}^T z(N+1) + \epsilon_{N+1} \\ & + \sum_{i=1}^{N} \big( z^T(i) S_i z(i) + e_i^T z(i) + \Delta_i^T z(i)\, u(i) + \xi_i^T z(i)\, v(i) \\ & \qquad\quad + P_i u^2(i) + p_i u(i) + \rho_i u(i) v(i) + Q_i v^2(i) + q_i v(i) \big) \end{aligned} \tag{5.22} $$

where the $\ell \times \ell$ matrices $S_i$, the $\ell$-vectors $e_i$, $\Delta_i$, $\xi_i$, and the scalars $P_i$, $p_i$, $Q_i$, $q_i$, $\rho_i$, $\epsilon_{N+1}$ are known to both players. Player II will choose mixed strategies $G_i^\circ(v|z)$ to minimize the maximum expected value. A special case of this problem is given in great detail as the first example of Chapter 6, so the argument below is somewhat abbreviated.

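We proceed by induction on the indices $i$, taken in reverse order. Define $w_i(z)$,

$$ \begin{aligned} w_i(z) = \operatorname*{val}_{(u(i),\, v(i))} \big[ & z^T S_i z + e_i^T z + P_i u^2(i) + p_i u(i) + \rho_i u(i) v(i) \\ & + \Delta_i^T z\, u(i) + \xi_i^T z\, v(i) + Q_i v^2(i) + q_i v(i) \\ & + w_{i+1}\big( T_i z + \alpha_i u(i) + \beta_i v(i) + \gamma_i \big) \big], \qquad i = 1, 2, \ldots, N \end{aligned} \tag{5.23} $$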

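in the usual manner, and note that

$$ w_{N+1}(z) = z^T S_{N+1} z + e_{N+1}^T z + \epsilon_{N+1} \tag{5.24} $$

We assume that $w_{i+1}(z)$ is quadratic in $z$ in some region of interest, i.e.,

$$ w_{i+1}(z) = z^T D_{i+1} z + d_{i+1}^T z + \delta_{i+1} \tag{5.25} $$

and attempt to prove that $w_i(z)$ is piecewise quadratic, that is, that $w_i(z)$ is given by some quadratic form in $z$ for any region of $E^\ell$. Let us make the following definitions

$$ \begin{aligned} D &= T_i^T D_{i+1} T_i + S_i \\ d &= e_i + T_i^T \big( d_{i+1} + 2 D_{i+1} \gamma_i \big) \\ P &= \alpha_i^T D_{i+1} \alpha_i + P_i \\ p &= 2 \alpha_i^T D_{i+1} \gamma_i + d_{i+1}^T \alpha_i + p_i \\ Q &= \beta_i^T D_{i+1} \beta_i + Q_i \\ q &= 2 \beta_i^T D_{i+1} \gamma_i + d_{i+1}^T \beta_i + q_i \\ \Delta &= \Delta_i + 2 T_i^T D_{i+1} \alpha_i \\ \xi &= \xi_i + 2 T_i^T D_{i+1} \beta_i \\ \rho &= \rho_i + 2 \alpha_i^T D_{i+1} \beta_i \\ \delta &= \delta_{i+1} + \gamma_i^T D_{i+1} \gamma_i + d_{i+1}^T \gamma_i \end{aligned} \tag{5.26} $$

Using these definitions, along with the assumption that $D_{i+1}$ is symmetric, (5.23) becomes, under assumption (5.25)

$$ w_i(z) = \operatorname*{val}_{(u,\, v)} \big[ z^T D z + d^T z + P u^2 + Q v^2 + p u + q v + \rho u v + \delta + \Delta^T z\, u + \xi^T z\, v \big] \tag{5.27} $$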
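The substitution that leads from (5.23) and (5.25) to (5.27) can be checked numerically. The following is a minimal sketch, assuming a two-dimensional state and arbitrary illustrative stage data; the dictionary keys, the helper names, and the numbers in `STAGE` and `NEXT` are hypothetical and are not taken from the text. It evaluates the bracketed expression of (5.23) directly and compares it with the collapsed form (5.27) built from the definitions (5.26).

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[dot(row, col) for col in zip(*B)] for row in A]

def stage_coefficients(stage, nxt):
    """Definitions (5.26): fold stage data and a quadratic w_{i+1} into (5.27)."""
    T, al, be, ga = stage["T"], stage["alpha"], stage["beta"], stage["gamma"]
    Dn, dn = nxt["D"], nxt["d"]          # Dn is assumed symmetric
    Tt = transpose(T)
    D_al, D_be, D_ga = matvec(Dn, al), matvec(Dn, be), matvec(Dn, ga)
    TtDT = matmul(Tt, matmul(Dn, T))
    inner = [dk + 2.0 * g for dk, g in zip(dn, D_ga)]
    return {
        "D": [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(TtDT, stage["S"])],
        "d": [e + t for e, t in zip(stage["e"], matvec(Tt, inner))],
        "P": dot(al, D_al) + stage["P"],
        "p": 2.0 * dot(al, D_ga) + dot(dn, al) + stage["p"],
        "Q": dot(be, D_be) + stage["Q"],
        "q": 2.0 * dot(be, D_ga) + dot(dn, be) + stage["q"],
        "Delta": [a + 2.0 * b for a, b in zip(stage["Delta"], matvec(Tt, D_al))],
        "xi": [a + 2.0 * b for a, b in zip(stage["xi"], matvec(Tt, D_be))],
        "rho": stage["rho"] + 2.0 * dot(al, D_be),
        "delta": nxt["delta"] + dot(ga, D_ga) + dot(dn, ga),
    }

def bracket_direct(stage, nxt, z, u, v):
    """Bracketed expression of (5.23) with w_{i+1} given by (5.25)."""
    zn = [a + al * u + be * v + g for a, al, be, g in
          zip(matvec(stage["T"], z), stage["alpha"], stage["beta"], stage["gamma"])]
    w_next = dot(zn, matvec(nxt["D"], zn)) + dot(nxt["d"], zn) + nxt["delta"]
    return (dot(z, matvec(stage["S"], z)) + dot(stage["e"], z)
            + stage["P"] * u * u + stage["p"] * u + stage["rho"] * u * v
            + dot(stage["Delta"], z) * u + dot(stage["xi"], z) * v
            + stage["Q"] * v * v + stage["q"] * v + w_next)

def bracket_collapsed(c, z, u, v):
    """Bracketed expression of (5.27)."""
    return (dot(z, matvec(c["D"], z)) + dot(c["d"], z) + c["P"] * u * u
            + c["Q"] * v * v + c["p"] * u + c["q"] * v + c["rho"] * u * v
            + c["delta"] + dot(c["Delta"], z) * u + dot(c["xi"], z) * v)

# Hypothetical illustrative data (not from the text).
STAGE = dict(T=[[1.0, 0.5], [0.0, 1.0]], alpha=[0.0, 1.0], beta=[1.0, 0.0],
             gamma=[0.5, -0.5], S=[[1.0, 0.0], [0.0, 2.0]], e=[1.0, -1.0],
             Delta=[0.5, 0.0], xi=[0.0, -1.0],
             P=-1.0, p=0.5, Q=2.0, q=-0.5, rho=1.5)
NEXT = dict(D=[[2.0, 1.0], [1.0, 3.0]], d=[1.0, 2.0], delta=0.25)
COEFFS = stage_coefficients(STAGE, NEXT)
```

The two evaluations agree for every $(z, u, v)$ only when $D_{i+1}$ is symmetric, which is the assumption stated in the text.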
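We may write this in vector matrix form.

$$ w_i(z) = \operatorname*{val}_{(F,\, G)} \int_U \int_V \begin{bmatrix} 1 & u & u^2 \end{bmatrix} \begin{bmatrix} z^T D z + d^T z + \delta & \xi^T z + q & Q \\ \Delta^T z + p & \rho & 0 \\ P & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix} dF(u)\, dG(v) \tag{5.28} $$

Defining

$$ A(z) = \begin{bmatrix} z^T D z + d^T z + \delta & \xi^T z + q & Q \\ \Delta^T z + p & \rho & 0 \\ P & 0 & 0 \end{bmatrix} \tag{5.29} $$

let us write (5.28) as

$$ w_i(z) = \max_{r \in R} \, \min_{s \in S} \; r^T A(z)\, s $$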

As developed earlier, $C_R$ for this problem is the set $\{ r \mid r_0 = 1,\; r_1 = t,\; r_2 = t^2,\; t \in [0, 1] \}$, $R$ is the convex hull of $C_R$ and is the region enclosed by

$$ r_0 = 1; \quad r_2 \ge r_1^2 \quad \text{and} \quad r_2 \le r_1 \tag{5.30} $$

or parametrically by the curve $C_R$ and by $C_R' = \{ r \mid r_0 = 1,\; r_1 = r_2 = t,\; t \in [0, 1] \}$. $S$ is analogous to $R$. The dual cone $P_S^*$ is defined by the boundary curves

$$ \begin{aligned} &\text{a.} \quad s_0 = 0 && \text{for } s_1 \ge 0 \text{ and } s_1 \ge -s_2 \\ &\text{b.} \quad s_0 + s_1 + s_2 = 0 && \text{for } s_0 \ge 0,\; s_1 \le -2 s_2 \\ &\text{c.} \quad 4 s_0 s_2 - s_1^2 = 0 && \text{for } s_2 \ge 0,\; 0 \ge s_1 \ge -2 s_2 \end{aligned} \tag{5.31} $$

The sets $\hat{R}$, which is the projection of $R$ on the $(r_1, r_2)$ plane, and $P_S^*$ are sketched in Figures 5-1 and 5-2. The set $P_S^*$ is the space in the positive $s_0$-direction with boundary given by (5.31).
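The boundary description (5.31) can be read as the condition that the quadratic $s_0 + s_1 t + s_2 t^2$ is nonnegative on $[0, 1]$ and touches zero somewhere in that interval. The sketch below, assuming that reading, classifies a vector $s$ by locating the minimum of the quadratic on the unit interval; the function names, the string labels, and the tolerance are hypothetical choices made for illustration.

```python
def quadratic(s, t):
    s0, s1, s2 = s
    return s0 + s1 * t + s2 * t * t

def critical_points(s):
    """Points of [0, 1] at which s0 + s1*t + s2*t^2 can attain its minimum."""
    _, s1, s2 = s
    points = [0.0, 1.0]
    if s2 > 0:
        t_star = -s1 / (2.0 * s2)      # vertex of the parabola
        if 0.0 < t_star < 1.0:
            points.append(t_star)
    return points

def classify(s, tol=1e-12):
    """Locate s relative to the dual cone of the moment curve (1, t, t^2)."""
    values = [(quadratic(s, t), t) for t in critical_points(s)]
    lowest = min(value for value, _ in values)
    if lowest < -tol:
        return "outside"
    if lowest > tol:
        return "interior"
    contacts = sorted(t for value, t in values if abs(value) <= tol)
    if contacts == [0.0, 1.0]:
        return "edge (5.31a)/(5.31b)"   # the line s0 = 0, s1 + s2 = 0
    if contacts == [0.0]:
        return "surface (5.31a)"        # contact at t = 0
    if contacts == [1.0]:
        return "surface (5.31b)"        # contact at t = 1
    return "surface (5.31c)"            # tangency at an interior t
```

For instance, `classify((1.0, -4.0, 4.0))` reports the parabolic surface because $1 - 4t + 4t^2 = (1 - 2t)^2$ touches zero at $t = 1/2$, while `classify((1.0, -3.0, 2.0))` reports a point outside the cone because $s_1 \le -2 s_2$ fails and the quadratic dips below zero before $t = 1$.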

At this point we introduce a parameter $a$, to be maximized according to the dictates of Theorem 4.5. The elements of the set $S(A, R, a)$ defined in equation (4.36) are then given by vectors $s \in E^3$ such that

$$ \begin{aligned} s_0 &= z^T D z + d^T z + \delta - a + (\Delta^T z + p)\, r_1 + P r_2 \\ s_1 &= \xi^T z + q + \rho\, r_1 \\ s_2 &= Q \end{aligned} \tag{5.32} $$

Since the mapping defined by $A(z)$ is linear for any $z$, the boundaries of this set correspond to those of $R$. Trivial or special cases such as $P = 0$ will not overly concern us, since the methods below still apply.


The implications of the continuity proofs of Section 5.3 are that $S(A(z), R, w_i(z))$ moves smoothly over $P_S^*$ as $z$ is varied. This essential fact may be verified here by substitution of numerical values, and is used in the discussion below. Basically, if $P_S^*$ and $S(A(z), R, w_i(z))$ are in contact at a point which is internal to one of the identifiable boundary regions of each, then as $z$ varies slightly, these surfaces will remain in contact although the exact contact point may move. Therefore, we may restrict our attention to pairs of surfaces, one each from $S(A(z), R, w_i(z))$ and $P_S^*$, in generating $w_i(z)$. We will simply examine the possibilities exhaustively, using the curves

$$ \begin{aligned} &\text{a.} \quad r_2 = r_1^2 \\ &\text{b.} \quad r_2 = r_1 \end{aligned} \tag{5.33} $$

of $R$ and the surfaces and edge

$$ \begin{aligned} &\text{a.} \quad s_0 = 0 \\ &\text{b.} \quad s_0 + s_1 + s_2 = 0 \\ &\text{c.} \quad s_0 = 0,\; s_1 + s_2 = 0 \\ &\text{d.} \quad 4 s_0 s_2 - s_1^2 = 0 \end{aligned} \tag{5.34} $$

of $P_S^*$. We shall find the value $w_i(z)$ and the optimal mixed strategies $F^\circ(u|z)$ and $G^\circ(v|z)$ for each possibility. Where strategies are not unique, we shall simply demonstrate at least one optimal strategy.

Case 1. The plane $s_0 = 0$ of $P_S^*$. All support hyperplanes to this surface except at the line $s_1 + s_2 = 0$ (Case 3) have representations $(\lambda, 0, 0)^T$, $\lambda > 0$. This implies the moments $(1\;\, 0\;\, 0)^T$ for the minimizer, with corresponding pure strategy

$$ G^\circ(v|z) = I_0(v) \tag{5.35} $$

From (5.32) we immediately have, using (5.33)

$$ \begin{aligned} &\text{a.} \quad a = z^T D z + d^T z + \delta + (\Delta^T z + p)\, r_1 + P r_1^2 \\ &\text{b.} \quad a = z^T D z + d^T z + \delta + (\Delta^T z + p + P)\, r_1 \end{aligned} \tag{5.36} $$

where $r_1 \in [0, 1]$. Examination of coefficients and maximization of $a$ over $r_1$ leads to the following results. (5.37)

i. $P \ge 0$, $\Delta^T z + p + P \le 0$. Then $r_1^\circ = 0$, $F^\circ(u|z) = I_0(u)$, and $w_i(z) = a_{\max} = z^T D z + d^T z + \delta$.

ii. $P \ge 0$, $\Delta^T z + p + P \ge 0$. Then $r_1^\circ = 1$, $F^\circ(u|z) = I_1(u)$, and $w_i(z) = z^T D z + (d + \Delta)^T z + \delta + p + P$.

iii. $P < 0$, $0 \le -(\Delta^T z + p)/2P \le 1$. Then $r_1^\circ = -(\Delta^T z + p)/2P$ and $F^\circ(u|z) = I_{r_1^\circ}(u)$. Also, substituting $r_1^\circ$ into (5.36a), the value is $w_i(z) = z^T D z + d^T z + \delta - (\Delta^T z + p)^2 / 4P$.

iv. $P < 0$, $-(\Delta^T z + p)/2P \le 0$. Same result as i.

v. $P < 0$, $-(\Delta^T z + p)/2P \ge 1$. Same result as ii.
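The five sub-cases of (5.37) amount to maximizing a one-variable function over $[0, 1]$. As a minimal sketch, assuming the shorthand $c = z^T D z + d^T z + \delta$ and $b = \Delta^T z + p$ (both names are introduced here for illustration only), the selection of $r_1^\circ$ and of the value can be written as follows and compared with a brute-force search over the boundary of $R$.

```python
def case1(c, b, P):
    """Maximize a in (5.36) over r1 in [0, 1]; returns (r1_opt, value).

    c stands for z'Dz + d'z + delta and b for Delta'z + p (hypothetical
    shorthand). For P >= 0 the upper boundary r2 = r1 is used (5.33b);
    for P < 0 the lower boundary r2 = r1**2 is used (5.33a).
    """
    if P >= 0:
        if b + P <= 0:
            return 0.0, c                  # sub-case i
        return 1.0, c + b + P              # sub-case ii
    # Concave parabola: clamp the stationary point (sub-cases iii, iv, v).
    r1 = min(1.0, max(0.0, -b / (2.0 * P)))
    return r1, c + b * r1 + P * r1 * r1

def brute_force(c, b, P, steps=1000):
    """Grid search over the boundary of R that maximizes P * r2."""
    best = None
    for k in range(steps + 1):
        r1 = k / steps
        r2 = r1 if P >= 0 else r1 * r1
        a = c + b * r1 + P * r2
        if best is None or a > best:
            best = a
    return best
```

Clamping the stationary point reproduces sub-cases iv and v because a concave parabola whose vertex lies outside $[0, 1]$ attains its maximum at the nearer endpoint.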


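Case 2. The plane $s_0 + s_1 + s_2 = 0$ of $P_S^*$. In this case we find that $s^\circ = (1\;\, 1\;\, 1)^T$ so that $G^\circ(v|z) = I_1(v)$. Substituting (5.32) in (5.34b) and using (5.33) gives

$$ \begin{aligned} &\text{a.} \quad a = z^T D z + (d + \xi)^T z + \delta + q + Q + (\Delta^T z + p + \rho)\, r_1 + P r_1^2 \\ &\text{b.} \quad a = z^T D z + (d + \xi)^T z + \delta + q + Q + (\Delta^T z + p + \rho + P)\, r_1 \end{aligned} \tag{5.38} $$

Once again we maximize $a$ over $r_1 \in [0, 1]$ to get the following results. (5.39)

i. $P \ge 0$, $\Delta^T z + p + \rho + P \le 0$. Then $r_1^\circ = 0$, $F^\circ(u|z) = I_0(u)$, and $w_i(z) = z^T D z + (d + \xi)^T z + \delta + q + Q$.

ii. $P \ge 0$, $\Delta^T z + p + \rho + P \ge 0$. Then $r_1^\circ = 1$, $F^\circ(u|z) = I_1(u)$, and $w_i(z) = z^T D z + (d + \xi + \Delta)^T z + \delta + q + Q + p + \rho + P$.

iii. $P < 0$, $0 \le -(\Delta^T z + p + \rho)/2P \le 1$. Then $r_1^\circ = -(\Delta^T z + p + \rho)/2P$, $F^\circ(u|z) = I_{r_1^\circ}(u)$, and

$$ w_i(z) = z^T \Big( D - \frac{\Delta \Delta^T}{4P} \Big) z + \Big( d + \xi - \frac{p + \rho}{2P}\, \Delta \Big)^T z + \delta + q + Q - \frac{(p + \rho)^2}{4P} $$

iv. $P < 0$, $-(\Delta^T z + p + \rho)/2P \le 0$. Same result as i.

v. $P < 0$, $-(\Delta^T z + p + \rho)/2P \ge 1$. Same result as ii.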


Case 3. The line $s_0 = 0$, $s_1 + s_2 = 0$. We note that this case applies only to $s_2 \le 0$. It is more complicated than those above because the separating hyperplanes of the two sets, which imply the strategy for Player II, no longer are in one-to-one correspondence with the points of contact. Thus we must examine the slope at the contact point of the boundary of $S(A(z), R, w_i(z))$. The condition $s_1 + s_2 = 0$ gives

$$ r_1^\circ = -(\xi^T z + q + Q)/\rho \tag{5.40} $$

which must be substituted in the appropriate equation of

$$ \begin{aligned} &\text{a.} \quad a = z^T D z + d^T z + \delta + (\Delta^T z + p)\, r_1 + P r_1^2 \\ &\text{b.} \quad a = z^T D z + d^T z + \delta + (\Delta^T z + p + P)\, r_1 \end{aligned} \tag{5.41} $$

provided of course that $r_1^\circ \in [0, 1]$, a necessary condition for Case 3 to hold at all. The following cases may be found.

i. $P \ge 0$. Then $F^\circ(u|z) = (1 - r_1^\circ)\, I_0(u) + r_1^\circ I_1(u)$; that is, the maximizer uses a mixed strategy of points $u = 0$ and $u = 1$, a condition which is clearer if Figure 5-1 is examined and the discussion of Section 4.6 is remembered. It can be seen that

$$ w_i(z) = z^T \Big( D - \frac{\Delta \xi^T}{\rho} \Big) z + \Big( d - \frac{p + P}{\rho}\, \xi - \frac{q + Q}{\rho}\, \Delta \Big)^T z + \delta - \frac{(p + P)(q + Q)}{\rho} $$

This results simply by substituting (5.40) into (5.41b). Using (5.32) we find that, by eliminating $r_1$ with $r_2 = r_1$,

$$ \frac{\partial s_0}{\partial s_1} = \frac{\Delta^T z + p + P}{\rho} \tag{5.42} $$

Equation (5.42) along with the condition $s_0 = 0$, $s_1 + s_2 = 0$ can be used to show that $s^\circ = \big( 1,\; -\frac{\partial s_0}{\partial s_1},\; -\frac{\partial s_0}{\partial s_1} \big)^T$ or that $G^\circ(v|z) = \big( 1 + \frac{\partial s_0}{\partial s_1} \big) I_0(v) - \frac{\partial s_0}{\partial s_1}\, I_1(v)$. If $r_1^\circ = 0$ or $r_1^\circ = 1$, this result may not give a separating hyperplane; one of the extremal strategies $I_0(v)$ or $I_1(v)$ is then optimum, although not necessarily uniquely so.

ii. $P \le 0$. In this case $F^\circ(u|z) = I_{r_1^\circ}(u)$ where $r_1^\circ$ is given by (5.40). Substituting into (5.41a) yields

$$ \begin{aligned} w_i(z) = {} & z^T \Big( D - \frac{\Delta \xi^T}{\rho} + \frac{P}{\rho^2}\, \xi \xi^T \Big) z + \Big( d - \frac{p}{\rho}\, \xi - \frac{q + Q}{\rho}\, \Delta + \frac{2P(q + Q)}{\rho^2}\, \xi \Big)^T z \\ & + \delta - \frac{p}{\rho}\,(q + Q) + \frac{P}{\rho^2}\,(q + Q)^2 \end{aligned} \tag{5.43} $$

Since $s_2 = Q$ at the point of contact, $s_1 = -Q$. Therefore we have

$$ \frac{\partial s_0}{\partial s_1} = \frac{\rho\,(\Delta^T z + p) - 2P\,(\xi^T z + q + Q)}{\rho^2} \tag{5.44} $$

and $G^\circ(v|z)$ depends upon $\partial s_0 / \partial s_1$ as in part i of this case.
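When $P \ge 0$ and $Q \le 0$, part i of Case 3 has both players mixing between the endpoints 0 and 1, so the stage game reduces to a two-by-two matrix game. The sketch below, assuming the hypothetical shorthand $c = z^T D z + d^T z + \delta$, $b = \Delta^T z + p$ and $e = \xi^T z + q$, computes $r_1^\circ$ from (5.40), the minimizer's weight on $v = 1$ from (5.42), and the value, and then exposes the expected payoffs so that the saddle-point property can be checked against every pure reply. The numbers in `PARAMS` are illustrative only.

```python
def kernel(u, v, c, b, P, e, Q, rho):
    """Stage payoff c + b*u + P*u^2 + e*v + Q*v^2 + rho*u*v (hypothetical shorthand)."""
    return c + b * u + P * u * u + e * v + Q * v * v + rho * u * v

def case3(c, b, P, e, Q, rho):
    """Mixed endpoint strategies of Case 3, part i (requires P >= 0, Q <= 0, rho != 0)."""
    r1 = -(e + Q) / rho            # (5.40): probability that u = 1
    w1 = -(b + P) / rho            # minus the slope (5.42): probability that v = 1
    value = c - (b + P) * (e + Q) / rho
    return r1, w1, value

def expected_vs_v(r1, v, params):
    """Expected payoff when the maximizer mixes u in {0, 1} against a pure v."""
    return (1.0 - r1) * kernel(0.0, v, *params) + r1 * kernel(1.0, v, *params)

def expected_vs_u(w1, u, params):
    """Expected payoff when the minimizer mixes v in {0, 1} against a pure u."""
    return (1.0 - w1) * kernel(u, 0.0, *params) + w1 * kernel(u, 1.0, *params)

# Illustrative numbers: c, b, P, e, Q, rho.
PARAMS = (1.0, -1.0, 0.5, 0.0, -0.5, 1.0)
```

With these numbers the payoff matrix on the endpoints is $[[1, 0.5], [0.5, 1]]$, and both players weight their endpoints equally. The result is meaningful only when both computed weights lie in $[0, 1]$, which mirrors the proviso stated after (5.41).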

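Case 4. The surface $4 s_0 s_2 - s_1^2 = 0$ of $P_S^*$. In this region we concern ourselves with tangency of $S(A(z), R, a)$ and $P_S^*$. Note that at a point of $P_S^*$ in this region there can be only one support hyperplane, namely, that corresponding to

$$ G^\circ(v|z) = I_{s_1^\circ}(v) \tag{5.45} $$

where

$$ s_1^\circ = -\frac{\partial s_0}{\partial s_1} \tag{5.46} $$

Using (5.32) we find that

$$ \frac{\partial s_0}{\partial s_1} = \frac{\xi^T z + q + \rho\, r_1^\circ}{2Q} \tag{5.47} $$

where $r_1^\circ$ is the first moment of the maximizer's optimal strategy. Substituting (5.33) and (5.32) into the equation (5.34) for the surface, we find

$$ \begin{aligned} \text{a.} \quad a &= z^T D z + d^T z + \delta + (\Delta^T z + p)\, r_1 + P r_1^2 - \frac{1}{4Q} \big( z^T \xi \xi^T z + 2 \rho\, r_1 \xi^T z + 2 q\, \xi^T z + q^2 + 2 \rho q\, r_1 + \rho^2 r_1^2 \big) \\ &= z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) + \Big( \Delta^T z - \frac{\rho}{2Q}\, \xi^T z + p - \frac{\rho q}{2Q} \Big) r_1 + \Big( P - \frac{\rho^2}{4Q} \Big) r_1^2 \\ \text{b.} \quad a &= z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) + \Big( \Delta^T z - \frac{\rho}{2Q}\, \xi^T z + p - \frac{\rho q}{2Q} + P \Big) r_1 - \frac{\rho^2}{4Q}\, r_1^2 \end{aligned} \tag{5.48} $$

Implicit in Case a. is that $P < 0$, while the contrary holds in Case b. Cases a. and b. are so similar in analysis that we shall treat them together, writing

$$ a = z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) + x\, r_1 + y\, r_1^2 \tag{5.49} $$

where the definitions of $x$ and $y$ should be obvious. Then we have the following situations.

i. If $y \ge 0$, $x + y < 0$. Then $r_1^\circ = 0$, $F^\circ(u|z) = I_0(u)$, $G^\circ(v|z) = I_{s_1^\circ}(v)$ where $s_1^\circ = -(\xi^T z + q)/2Q$, and

$$ w_i(z) = z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) $$

ii. If $y \ge 0$, $x + y \ge 0$. Then $r_1^\circ = 1$, $F^\circ(u|z) = I_1(u)$, $G^\circ(v|z) = I_{s_1^\circ}(v)$ where $s_1^\circ = -(\xi^T z + q + \rho)/2Q$, and

$$ w_i(z) = z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) + x + y $$

iii. If $y < 0$ and $0 \le -x/2y \le 1$. Then $r_1^\circ = -x/2y$; $F^\circ(u|z) = I_{r_1^\circ}(u)$ if $P < 0$ (i.e., Case a. is pure strategy) and $F^\circ(u|z) = (1 - r_1^\circ)\, I_0(u) + r_1^\circ I_1(u)$ if $P \ge 0$. The minimizer uses the pure strategy $G^\circ(v|z) = I_{s_1^\circ}(v)$ where $s_1^\circ = -(\xi^T z + q + \rho\, r_1^\circ)/2Q$, and the value is

$$ w_i(z) = z^T \Big( D - \frac{\xi \xi^T}{4Q} \Big) z + \Big( d - \frac{q}{2Q}\, \xi \Big)^T z + \Big( \delta - \frac{q^2}{4Q} \Big) - \frac{x^2}{4y} $$

iv. If $y < 0$ and $-x/2y < 0$, then $r_1^\circ = 0$ (because $r_1^\circ \in [0, 1]$) and the optimal strategies and value of i. occur.

v. If $y < 0$ and $-x/2y > 1$, then $r_1^\circ = 1$ and the results of ii. apply.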
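The bookkeeping of Case 4 can be sketched in a few lines. The code below assumes $Q > 0$ and uses the hypothetical shorthand $c = z^T D z + d^T z + \delta$, $b = \Delta^T z + p$ and $e = \xi^T z + q$; it forms $x$ and $y$ as read off from (5.48), selects $r_1^\circ$ according to situations i through v, and returns the minimizer's pure strategy point from (5.46) and (5.47). It does not check that the contact actually lies on the parabolic surface, that is, that $s_1^\circ \in [0, 1]$.

```python
def kernel(u, v, c, b, P, e, Q, rho):
    """Stage payoff c + b*u + P*u^2 + e*v + Q*v^2 + rho*u*v (hypothetical shorthand)."""
    return c + b * u + P * u * u + e * v + Q * v * v + rho * u * v

def case4(c, b, P, e, Q, rho):
    """Return (r1_opt, s1_opt, value) for tangency with 4*s0*s2 - s1^2 = 0; Q > 0."""
    base = c - e * e / (4.0 * Q)
    # Case a (P < 0): P enters the r1^2 coefficient; case b (P >= 0): the r1 coefficient.
    x = b - rho * e / (2.0 * Q) + (P if P >= 0 else 0.0)
    y = -rho * rho / (4.0 * Q) + (P if P < 0 else 0.0)
    if y >= 0:
        r1 = 1.0 if x + y >= 0 else 0.0               # situations i and ii
    else:
        r1 = min(1.0, max(0.0, -x / (2.0 * y)))       # situations iii, iv, v
    s1 = -(e + rho * r1) / (2.0 * Q)                  # (5.46) with (5.47)
    value = base + x * r1 + y * r1 * r1
    return r1, s1, value

# Illustrative numbers with P < 0 and Q > 0: c, b, P, e, Q, rho.
PARAMS = (0.0, 1.0, -1.0, -1.0, 1.0, 1.0)
```

For `PARAMS` the payoff is concave in $u$ and convex in $v$, so the pair $(r_1^\circ, s_1^\circ) = (0.6, 0.2)$ is a saddle point in pure strategies with value 0.2, which can be confirmed by evaluating `kernel` along each coordinate.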


We have demonstrated by exhausting the possibilities that $w_i(z)$ is piecewise quadratic if $w_{i+1}(z)$ is quadratic, and by extension we see that if $w_{i+1}(z)$ is piecewise quadratic then $w_i(z)$ will necessarily be so also. This completes our induction step and shows that for the linear quadratic game with scalar controls the principle of optimality and the method of dual cones may be applied to arrive at a solution.

A trio of remarks may be made about the constructions above. First, we have not been particularly concerned about the lack of uniqueness of solutions, a fact that may seem to obscure the semicontinuity of the solutions. Nevertheless, the semicontinuity holds. Second, we note that the optimal first moments are either extremal elements of $[0, 1]$ or are linearly related to $z$. Finally, it can be observed in the solutions that for Player I to have optimal mixed strategies, it is necessary that

$$ P = \alpha_i^T D_{i+1} \alpha_i + P_i \ge 0 \tag{5.50} $$

For the minimizing player to have such strategies, the condition

$$ Q = \beta_i^T D_{i+1} \beta_i + Q_i \le 0 \tag{5.51} $$

must hold. These conditions are of course not sufficient.

5.5 A LINEAR QUADRATIC GAME WITH VECTOR CONTROLS

If the controls of Section 5.4 are vectors rather than scalars, then the value function is still piecewise quadratic. This is a fact of fundamental importance, for it is a characterization of the solution for a common class of games. It is proven in this section by a technique which is in the spirit of Section 5.4, but which is of necessity not exhaustive in nature. The approach is to show that for an arbitrary pair of surfaces, one from $P_S^*$ and the second from $S(A(z), R, a)$, the value function implied by use of Theorem 4.5 is quadratic in $z$. Piecewise quadraticity follows immediately. Because of the nature of this proof, it is concerned only with the form of the solution, although the techniques might be used to find the exact solution if that were desired.

The problem of concern to us has dynamics given by

$$ z(i+1) = T_i z(i) + \alpha_i u(i) + \beta_i v(i) + \gamma_i \tag{5.52} $$

and payoff function, for the truncated game starting at stage $j$,

$$ \begin{aligned} J_j = {} & z^T(N+1)\, S_{N+1}\, z(N+1) + e_{N+1}^T z(N+1) + \epsilon_{N+1} \\ & + \sum_{i=j}^{N} \big( z^T(i) S_i z(i) + e_i^T z(i) + z^T(i) \Delta_i u(i) + z^T(i) \xi_i v(i) \\ & \qquad\quad + u^T(i) P_i u(i) + p_i^T u(i) + v^T(i) Q_i v(i) + q_i^T v(i) + u^T(i) \rho_i v(i) \big) \end{aligned} \tag{5.53} $$

where $z$ is an $\ell$-vector, $u$ is an $m$-vector to be chosen from an $m$-dimensional unit hypercube $U$, $v$ is an $n$-vector to be chosen from an $n$-dimensional unit hypercube $V$, and $T_i$, $\alpha_i$, $\beta_i$, $S_i$, $\Delta_i$, $\xi_i$, $P_i$, $Q_i$, $\rho_i$ are known matrices of suitable size, $\gamma_i$, $e_i$, $p_i$, $q_i$ are known vectors, and $\epsilon_{N+1}$ a scalar constant. We are concerned with proving that the value $w_j(z)$ is piecewise quadratic in $z$, where

$$ w_j(z) = \operatorname*{val}_{(u(i),\, v(i),\; i = j, \ldots, N)} J_j \tag{5.54} $$

Note that $w_{N+1}(z)$ is indeed of the required form. For our induction hypothesis, we assume that $w_{i+1}(z)$ is piecewise quadratic, so that

$$ w_{i+1}(z) = z^T D_{i+1} z + d_{i+1}^T z + \delta_{i+1} \tag{5.55} $$

for some $D_{i+1}$, $d_{i+1}$, $\delta_{i+1}$ in a given region of interest, and prove that $w_i(z)$ is also of this form. Using the principle of optimality

$$ \begin{aligned} w_i(z) = \operatorname*{val}_{(u,\, v)} \big[ & z^T S_i z + e_i^T z + z^T \Delta_i u + z^T \xi_i v + u^T P_i u + p_i^T u + v^T Q_i v + q_i^T v \\ & + u^T \rho_i v + w_{i+1}( T_i z + \alpha_i u + \beta_i v + \gamma_i ) \big] \end{aligned} \tag{5.56} $$

which after substitutions and definitions in a manner similar to that of the previous section gives the form

$$ w_i(z) = \operatorname*{val}_{(u,\, v)} \big[ z^T D z + d^T z + u^T P u + v^T Q v + p^T u + q^T v + u^T \rho\, v + \delta + z^T \Delta u + z^T \xi v \big] \tag{5.57} $$

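At this point we define functions $r(u)$ and $s(v)$ and a matrix $A(z)$ so that (5.57) may be put in standard form. For clarity of presentation we utilize notation which is somewhat more appropriate to matrices than to vectors in that double subscripting of vectors is used. To be consistent with our previous work, however, we continue to work with vectors and matrices rather than create awkward definitions for some of the sets involved. Thus we define

$$ \begin{aligned} r_{ij}(u) &= u_i u_j, & i &= 1, 2, \ldots, m; & j &= 0, i, i+1, \ldots, m; & u_0 &= 1 \\ s_{ij}(v) &= v_i v_j, & i &= 1, 2, \ldots, n; & j &= 0, i, i+1, \ldots, n; & v_0 &= 1 \end{aligned} $$

and we define $r$ and $s$ as

$$ \begin{aligned} r &= \big( 1,\; r_{10}, \ldots, r_{m0},\; r_{11}, \ldots, r_{mm},\; r_{12}, \ldots, r_{m-1,m} \big)^T \\ s &= \big( 1,\; s_{10}, \ldots, s_{n0},\; s_{11}, \ldots, s_{nn},\; s_{12}, \ldots, s_{n-1,n} \big)^T \end{aligned} $$

The ordering of the components of $r$ and $s$ will not generally be of significance to us. With these definitions, (5.57) may be rewritten as

$$ w_i(z) = \max_{r \in R} \, \min_{s \in S} \; r^T \begin{bmatrix} z^T D z + d^T z + \delta & (\xi^T z + q)^T & Q_{11} \;\; Q_{22} \;\; \cdots \;\; Q_{n-1,n} + Q_{n,n-1} \\ \Delta^T z + p & \rho & 0 \\ \begin{matrix} P_{11} \\ P_{22} \\ \vdots \\ P_{mm} \\ P_{12} + P_{21} \\ P_{13} + P_{31} \\ \vdots \\ P_{m-1,m} + P_{m,m-1} \end{matrix} & 0 & 0 \end{bmatrix} s \tag{5.60} $$

In this equation, we have used as usual the definitions

$$ r = \int_U r(u)\, dF(u), \qquad s = \int_V s(v)\, dG(v), $$

$R$ is the convex hull of $C_R$, $S$ is the convex hull of $C_S$, and so on.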
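The moment vectors used in the standard form are simple to generate. The sketch below assumes one particular ordering of the components (constant term, first moments, squares, then cross products); the text notes that the ordering is not significant, and the function names are hypothetical. For a mixed strategy with finite spectrum the integral defining $r$ reduces to a weighted sum.

```python
def moment_vector(u):
    """r(u) for a pure strategy u: 1, u_i, u_i^2, then u_i*u_j for i < j."""
    m = len(u)
    r = [1.0]
    r.extend(u)                                   # r_{i0} = u_i
    r.extend(ui * ui for ui in u)                 # r_{ii} = u_i^2
    for i in range(m):
        for j in range(i + 1, m):
            r.append(u[i] * u[j])                 # r_{ij}, i < j
    return r

def mixed_moments(points, weights):
    """Moments of a finite mixed strategy: weighted sum of pure moment vectors."""
    vectors = [moment_vector(u) for u in points]
    size = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(size)]
```

An $m$-vector of controls gives $(m^2 + 3m + 2)/2$ moments. Mixing the scalar points 0 and 1 with equal weight gives `[1.0, 0.5, 0.5]`, a point on the boundary $r_2 = r_1$ of the scalar case.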

The proof proceeds in three major steps. First it is argued by using our knowledge of simple cases and of the nature of a solution to the problem that the boundaries of $R$ and $S$ must have a certain form. Then we show that the boundaries of $P_S^*$ must have certain properties. Finally, using this knowledge of $R$ and $P_S^*$, the form of $w_i(z)$ is discussed.

In developing a structure for the boundary of $R$, we shall exploit the fact that the competitive element of the game appears only through the $\rho$ matrix in the form of terms $\rho_{ij} u_i v_j$. Thus only the first moments $r_{i0}$, $i = 1, 2, \ldots, m$ must be chosen with the opponent in mind. The terms $r_{ij}$, $j \ne 0$, may be chosen to optimize the payoff, consistent only with the constraints imposed by the value of $r_{i0}$. Since we know from the scalar control case (Figure 5-1) that $r_{ii} \ge r_{i0}^2$ and $r_{ii} \le r_{i0}$ are required for any realizable distribution, and since $r_{ii}$ must be chosen to have minimum or maximum value depending upon the algebraic sign of $P_{ii}$, it follows that the boundary regions of $R$ have $r_{ii}$ related to $r_{i0}$ by

$$ \begin{aligned} &\text{a.} \quad \min r_{ii} = r_{i0}^2 \\ &\text{b.} \quad \max r_{ii} = r_{i0} \end{aligned} \tag{5.61} $$


We may argue in a similar manner concerning the cross-coupling moments $r_{ij}$, $i \ne 0$, $j \ne i$. Two separate possibilities arise in this case. If either $r_{i0}$ or $r_{j0}$ is associated with a pure strategy, then $r_{ij} = r_{i0} r_{j0} = E[u_i u_j]$. If both control elements are associated with mixed strategies, then using the argument about the possibility of choosing $r_{ij}$ independently of the competition yields that $r_{ij}$ should be either minimized or maximized within the limits of the chosen first moments $r_{i0}$ and $r_{j0}$. Some thought and an examination of Figure 4-1 reveals that the maximum value of $r_{ij}$ is given by

$$ \max r_{ij} = \min\, [\, r_{i0},\; r_{j0} \,] \tag{5.62} $$

and the minimum by

$$ \min r_{ij} = \max\, [\, 0,\; r_{i0} + r_{j0} - 1 \,] \tag{5.63} $$

This can also be shown by considering possible bivariate distributions on $u_i$ and $u_j$.
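The bounds (5.62) and (5.63) can be illustrated by enumerating bivariate distributions directly. The sketch below is a simplified illustration rather than the argument of the text: it assumes both controls mix only between 0 and 1, so a distribution is a set of four vertex probabilities and $E[u_i u_j]$ equals the probability that both controls equal one. The function names and the grid resolution are hypothetical.

```python
def cross_moment_bounds(ri0, rj0):
    """Closed-form limits (5.63) and (5.62) on r_ij for given first moments."""
    return max(0.0, ri0 + rj0 - 1.0), min(ri0, rj0)

def enumerated_range(ri0, rj0, steps=200):
    """Range of E[u_i u_j] over distributions on the vertices of the unit square."""
    feasible = []
    for k in range(steps + 1):
        p11 = k / steps                 # P(u_i = 1, u_j = 1) = E[u_i u_j]
        p10 = ri0 - p11                 # P(u_i = 1, u_j = 0)
        p01 = rj0 - p11                 # P(u_i = 0, u_j = 1)
        p00 = 1.0 - p11 - p10 - p01     # P(u_i = 0, u_j = 0)
        if min(p00, p01, p10, p11) >= -1e-12:
            feasible.append(p11)
    return min(feasible), max(feasible)
```

The same limits hold for arbitrary distributions on the unit square, since $u_i u_j \le u_i$, $u_i u_j \le u_j$ and $(1 - u_i)(1 - u_j) \ge 0$ hold pointwise; the vertex distributions show that the limits are attained.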

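To find $P_S^*$, we exploit Theorem 4.6, which says that the boundaries of $P_S^*$ may be generated using the pure strategies represented by $C_S$. A pure strategy for the minimizer will have $n_t$ elements, say the first $0 \le n_t \le n$, chosen from $(0, 1)$, $n_0$ elements, $0 \le n_0 \le n$, with value zero, and $n_1$ elements, say the last $0 \le n_1 \le n$, $n_1 = n - n_t - n_0$, with value one. Let the region of $C_S$ with this characteristic be denoted $C_S'$, so that

$$ \begin{aligned} C_S' = \{ x \mid {} & x \in E^{(n^2 + 3n + 2)/2},\; x = s(t), \text{ where} \\ & t_i \in (0, 1), \quad i = 1, 2, \ldots, n_t \\ & t_i = 0, \quad i = n_t + 1, \ldots, n_t + n_0 \\ & t_i = 1, \quad i = n_t + n_0 + 1, \ldots, n \} \end{aligned} \tag{5.64} $$

and let $P_S^{*\prime}$ be the dual convex cone generated by $C_S'$, i.e.

$$ P_S^{*\prime} = \{ s \mid s \in E^{(n^2 + 3n + 2)/2},\; s^T x \ge 0 \text{ for all } x \in C_S' \} \tag{5.65} $$

The set $P_S^{*\prime}$ consists of the intersection of all half spaces defined by hyperplanes with representation $x \in C_S'$, and it is clear that $P_S^* \subset P_S^{*\prime}$. For a given $x(t) \in C_S'$, contact of the boundary with the corresponding hyperplane requires that

$$ \begin{aligned} &\text{a.} \quad s^T x(t) = 0 \\ &\text{b.} \quad \frac{\partial}{\partial t_i} \big[ s^T x(t) \big] = 0, && i = 1, 2, \ldots, n_t \\ &\text{c.} \quad \frac{\partial}{\partial t_i} \big[ s^T x(t) \big] \ge 0, && i = n_t + 1, \ldots, n_t + n_0 \\ &\text{d.} \quad \frac{\partial}{\partial t_i} \big[ s^T x(t) \big] \le 0, && i = n_t + n_0 + 1, \ldots, n \end{aligned} \tag{5.66} $$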


By using (5.66a) and (5.66b) we may remove the dependence upon $t$
and thus find an equation for the surface of $P_S^{*\prime}$ as the
relevant components of $t$ vary over (0,1). To do this, we expand
(5.66b) to get, using the notational conventions defined previously
for $s$,

    $s_{i0} + 2 t_i s_{ii} + \sum_{j=1}^{i-1} s_{ji} t_j + \sum_{j=i+1}^{n} s_{ij} t_j = 0$,   $i = 1, 2, \ldots, n_t$        (5.67)


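When the known values $t_i = 0$ and $t_i = 1$ are substituted into
(5.67), there remain $n_t$ linear equations in the $n_t$ unknowns
$t_i$, $i = 1, 2, \ldots, n_t$. These may be represented in the form

$$
\begin{bmatrix}
2s_{11} & s_{12} & \cdots & s_{1 n_t} \\
s_{12} & 2s_{22} & \cdots & s_{2 n_t} \\
\vdots & \vdots & \ddots & \vdots \\
s_{1 n_t} & s_{2 n_t} & \cdots & 2s_{n_t n_t}
\end{bmatrix}
\begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_{n_t} \end{bmatrix}
= -
\begin{bmatrix}
s_{10} + \sum_{j=n_t+n_0+1}^{n} s_{1j} \\
\vdots \\
s_{n_t 0} + \sum_{j=n_t+n_0+1}^{n} s_{n_t j}
\end{bmatrix}
\qquad (5.68)
$$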


Suppose (5.68) were solved for the components $t_i$, $i = 1, 2,
\ldots, n_t$, and the results substituted in (5.66a). In solving for
the $t_i$, any denominator terms will contain only elements $s_{ij}$
which corresponded to quadratic elements $t_i^2$ or $t_i t_j$ in
(5.66a). Furthermore, numerators will contain terms for which
$t_j = 0$ or $t_j = 1$ or terms which correspond to linear functions
of $t_i$, that is, elements $s_{i0}$. Finally, $s_{00}$ does not
appear in the solutions for the elements $t_i$. Thus inserting the
expressions for $t_i$ in (5.66a) and clearing of fractions gives an
equation of the form

    $s_{00} h_0(s) + \sum_{i=1}^{n} s_{i0} h_i(s) + \sum_{i=1}^{n} \sum_{j=1}^{n} s_{i0} s_{j0} h_{ij}(s) + g(s) = 0$        (5.69)

where the functions of $s$ indicated are functions only of the higher
order terms $s_{ij}$, $i, j \ne 0$. Many of the functions are in fact
zero and are retained only to keep the expression (5.69) simple and
symmetrical, since their exact nature is unimportant for our
purposes.
Having developed characteristics (5.61), (5.62), and (5.63) of the
boundary of R and characteristics (5.69) of the boundary of
$P_S^{*\prime}$, we proceed to examine the nature of $w_i(z)$. In the
usual manner we bias the (0,0) term of the matrix of (5.60) by
subtracting a parameter $\alpha$ and then forming $S(A(z), R, \alpha)$.
From (5.60) we see that a particular element $r \in R$ is mapped as
follows into S-space.


B00  =  —  ^  D£  +  d^z  +  6  +  (zT A  +  £} 


10 


20 


Lrm0j 


m 


ii 


i=l 


m-JL  m  ' 

+S 


*10 

r10 

*20 

• 

a 

• 

T  T 

=  K  *  + 

*20 

% 

a 

a 

a 

„rm0. 

    $s_{ii} = Q_{ii}$,   $i = 1, 2, \ldots, n$

    $s_{ij} = Q_{ij} + Q_{ji}$,   $i = 1, 2, \ldots, n-1$;  $j = i+1, i+2, \ldots, n$

                                                                (5.70)


These coordinates must lie, for the maximum $\alpha$, on the boundary
of $P_S^{*\prime}$, and thus must satisfy (5.69). Substituting (5.70)
into (5.69), recognizing that $s_{ij}$ is a constant for $i, j \ne 0$
and that $s_{i0}$ is linear in $z$ and in $r_{i0}$, and using the fact
that $h_0(s) \ne 0$ by the nature of $s$, we can write the $s_{00}$
point in the hyperplane corresponding to $s_{ij}$, $i, j \ne 0$, of
$S(A(z), R, \alpha)$ in the form (for suitable constant matrices and
vectors)

    $s_{00} = c_0 + c_1^T z + c_2^T \tilde{r} + z^T C_3 z + z^T C_4 \tilde{r} + \tilde{r}^T C_5 \tilde{r}$        (5.71)

Here we define

    $\tilde{r} = [r_{10}, r_{20}, \ldots, r_{m0}]^T$        (5.72)

It is noteworthy that $s_{00}$ in (5.71) depends only on the first
moments $r_{i0}$ of the maximizer's strategy. Substituting (5.71) into
the first equation of (5.70) and solving for $\alpha$ yields the form,
for suitable matrices and vectors,

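    $\alpha = z^T B_1 z + b_2^T z + b_3 + b_4^T \tilde{r} - z^T B_5 \tilde{r} - \tilde{r}^T C_6 \tilde{r}$
        $+ \sum_{i=1}^{m} P_{ii} r_{ii} + \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} (P_{ij} + P_{ji}) r_{ij}$        (5.73)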

It is necessary that $r \in R$ be chosen to maximize $\alpha$; the
maximum of $\alpha$ will be $w_i(z)$.

The structure of the boundary of R may now be exploited.
Parameterize (5.73) by letting $r_{i0} = t_i$, $i = 1, 2, \ldots, m$,
$t_i \in [0,1]$.


The boundary region of interest is such that it generates some pure
strategies and some mixed strategies for components of $u$. Without
loss of generality, let the first $m'$ components, $0 \le m' \le m$, be
associated with pure strategies, and let the final $m - m'$ be mixed.
Then (5.51) implies

    $r_{ii} = t_i^2$,   $i = 1, 2, \ldots, m'$
    $r_{ii} = t_i$,     $i = m'+1, \ldots, m$        (5.74)


For the $r_{ij}$, $m \ge j > i > m'$, for which mixed strategy
cross-coupling occurs, we may suppose that the coefficients
$(P_{ij} + P_{ji})$ in (5.73) are such that, using (5.62) and (5.63),

    $r_{ij} = r_{i0}$,                  $(i,j) \in K_1$
    $r_{ij} = r_{j0}$,                  $(i,j) \in K_2$
    $r_{ij} = 0$,                       $(i,j) \in K_3$
    $r_{ij} = r_{i0} + r_{j0} - 1$,     $(i,j) \in K_4$        (5.75)

where the $K_k$ are sets of integer pairs, and
$K_1 \cup K_2 \cup K_3 \cup K_4$ is the set of all $(i,j)$ pairs,
$m \ge j > i > m'$. Then (5.73) becomes


    $\alpha = z^T B_1 z + b_2^T z + b_3 + b_4^T t - z^T B_5 t - t^T C_6 t + \sum_{i=1}^{m'} P_{ii} t_i^2$
        $+ \sum_{i=m'+1}^{m} P_{ii} t_i + \sum_{(i,j) \in K_1} (P_{ij} + P_{ji}) t_i + \sum_{(i,j) \in K_2} (P_{ij} + P_{ji}) t_j$
        $+ \sum_{(i,j) \in K_4} (P_{ij} + P_{ji}) (t_i + t_j - 1)$        (5.76)


The maximization of $\alpha$ over $t_i \in [0,1]$, $i = 1, 2, \ldots, m$
may now be performed. Some $t_i$ appear linearly in (5.76) and take on
values of either 0 or 1 according to the signs of their coefficients.
For those $t_i$ which appear quadratically, we find the stationary
point of (5.76)

    $\partial \alpha / \partial t_i = 0 = (b_4)_i - z^T (B_5)_i - t^T ((C_6)_i + (C_6^T)_i) + \beta_i + P_i(t_i)$        (5.77)

where the notation $(\cdot)_i$ indicates the $i$th element or column and

    $\beta_i = \sum (P_{ij} + P_{ji})$, summed over a set of applicable $(i,j)$

    $P_i(t_i) = P_{ii}$ for $m \ge i > m'$,   $P_i(t_i) = 2 P_{ii} t_i$ for $1 \le i \le m'$        (5.78)

Equations (5.78) are purposely left vague, since they depend upon
which sets $K_k$ contain index $i$, and in what manner it is contained.
This is not important to our argument, since $\beta_i$ is constant in
any case. The set of equations (5.77) is linear in $z$ and $t$, and the
coefficients of $t$ are known constants. The equation set may in
principle be solved so that $t_i \in [0,1]$, although in practice
constraining the values to this bounded set may be a nuisance. A
solution, perhaps not unique, must exist by the nature of the problem,
and after all the extremal values of $t_i$ have been found, there will
remain a set of equations of the form (5.77) in which some number $k$ of
the components of $t$ are unknown and the same number $k$ of equations
may be solved. It is clear that the unknown components must be
linear functions of $z$, a crucial point.
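
The mechanism can be illustrated with a scalar stand-in. The sketch below is an assumption-laden toy, not the general construction: it takes a single parameter $t$, a concave quadratic in $t$ with a bilinear $z t$ term, and hypothetical coefficient values (the names `b4`, `B5`, `C6` merely echo the roles of the coefficients in (5.77)). The maximizing $t$ is either 0, 1, or a linear function of $z$, so the maximum is quadratic in $z$ on each piece.

```python
def alpha(z, t, b4=1.0, B5=1.0, C6=1.0):
    """Scalar toy objective: b4*t - B5*z*t - C6*t^2 (coefficients hypothetical)."""
    return b4 * t - B5 * z * t - C6 * t * t

def best_t(z, b4=1.0, B5=1.0, C6=1.0):
    """Stationary point of alpha in t, clipped to the admissible interval [0, 1]."""
    t = (b4 - B5 * z) / (2.0 * C6)   # linear in z
    return min(1.0, max(0.0, t))

def value(z, **coeffs):
    """Maximum of alpha over t in [0, 1]; piecewise quadratic in z."""
    return alpha(z, best_t(z, **coeffs), **coeffs)
```

With the default coefficients the three pieces are `value(z) = -z` for `z <= -1`, `(1 - z)**2 / 4` for `-1 <= z <= 1`, and `0` for `z >= 1`.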

Therefore the elements $t_i$, $i = 1, 2, \ldots, m$, which maximize
$\alpha$ are either zero or one in value or are linear functions of $z$.
Substituting them into (5.76) clearly gives the desired result, i.e.,
$\alpha = w_i(z)$ is a quadratic function of $z$.

Since both $S(A(z), R, \alpha)$ and $P_S^*$ must by their nature have
finite numbers of recognizable surfaces, i.e., boundary regions for
which a single equation set or parameterization rule may be used to
describe the region, the arguments above may be repeated for each
pair of surfaces. Therefore $w_i(z)$ is piecewise quadratic. We have
proven the following theorem.

Theorem 5.7: The N-stage game starting at stage i with linear
             dynamics, quadratic payoff function, and controls
             chosen from unit hypercubes at each stage has
             a piecewise quadratic value function.

This theorem holds whether open-loop or closed-loop strategies
are involved. It is particularly significant for the closed-loop
case, for it implies that the principle of optimality may be applied
to give exact solutions. It is also significant for numerical solutions,
since computation of the value is then reduced to determination of
coefficients.

5.6  SUMMARY 

In this chapter certain multistage games were shown to be
reducible to sequences of separable static games in which the state
vector is a simple parameter. The continuity characteristics of the
optimal solutions were then extensively investigated. Finally, the
method of dual cones was applied to linear-quadratic games and it
was demonstrated that the value function is not only continuous, but
piecewise quadratic.

The implications of these results are obvious: certain
dynamic games can be solved. This can be done, at least in
principle, for all games (of the class studied here) with open loop
strategies and for linear-quadratic games with closed loop strategies;
it may also be possible for other games. Furthermore, the continuity
properties and the nature of the method of dual cones
guarantee that numerical approximation is both straightforward
and appropriate. This latter point should prove to be particularly
important for applications.


CHAPTER  6 


EXAMPLES 


In this chapter are several examples which illustrate the
ideas involved in solving polynomial multistage games using the
method of dual cones. The examples are of low dimension so that
the geometric interrelationships may be visualized and are motivated
by using a multistage formulation even when it is not the multistage
character which is of primary interest. The demonstrative value of
the models is emphasized rather than the intrinsic value.

6.1 A LINEAR-QUADRATIC SCALAR PROBLEM

The  first  example  is  an  extremely  simple  one  which  we  shall 
examine  in  detail;  its  simplicity  is  such  that  we  may  concentrate 
on  our  techniques  and  not  be  distracted  by  algebraic  detail. 

Let $z$ be a scalar state variable and let $u' \in [-1/2, 1/2]$,
$v' \in [-1/2, 1/2]$ be scalar controls for a system with dynamics

    $z(i+1) = z(i) + u'(i) + v'(i)$        (6.1)

Suppose that an N-stage game with final value payoff

    $J = z^2(N+1)$        (6.2)

is to be played using this system, with player I choosing $u'(i)$ and
maximizing and player II choosing $v'(i)$ and minimizing, where
$i = 1, 2, \ldots, N$. Let us agree, since the parameters are scalars,
to use subscripts to indicate the stage index, $z_i = z(i)$, etc., and
let us transform the controls using $u_i = u'(i) + 1/2$,
$v_i = v'(i) + 1/2$ so that the dynamics (6.1) become

    $z_{i+1} = z_i - 1 + u_i + v_i$        (6.3)

where $u_i \in [0,1]$, $v_i \in [0,1]$ as required by our paradigm.

The solution to this problem appears intuitively obvious except
near the origin $z = 0$: the maximizer will choose his control to get
as far from the origin, $z = 0$, as possible and the minimizer will
attempt to cause $z_{N+1}$ to be near the origin. Thus for $z_N \gg 0$,
for example, $u'_i = 1/2$, $v'_i = -1/2$ is obvious, so that
$z_{i+1} = z_i$ and $z_{N+1} = z_1$. For $z_N \approx 0$, however,
intuition is not so helpful; e.g., if $z_N = 0$, then

    $\min_{v_N} \max_{u_N} z_{N+1}^2 = 1/4$
                                                    (6.4)
    $\max_{u_N} \min_{v_N} z_{N+1}^2 = 0$

and the need for a mixed strategy for one or both players is apparent.
We shall find those mixed strategies and also verify the intuitive pure
strategies.
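
The gap between the two pure-strategy security levels can be reproduced by brute force. The following Python sketch is illustrative only; it assumes the transformed dynamics (6.3) with both controls on a uniform grid over [0,1] (the grid size and function names are assumptions), and evaluates min-max and max-min of the terminal payoff.

```python
def grid(n=100):
    """n + 1 equally spaced control values on [0, 1]."""
    return [k / n for k in range(n + 1)]

def payoff(z, u, v):
    """Terminal payoff z_{N+1}^2 with z_{N+1} = z - 1 + u + v, as in (6.3)."""
    return (z - 1.0 + u + v) ** 2

def minmax(z, n=100):
    """Minimizer commits to a pure control first; maximizer replies."""
    g = grid(n)
    return min(max(payoff(z, u, v) for u in g) for v in g)

def maxmin(z, n=100):
    """Maximizer commits to a pure control first; minimizer replies."""
    g = grid(n)
    return max(min(payoff(z, u, v) for v in g) for u in g)
```

At `z = 0` the two quantities differ (1/4 against 0), so no pure-strategy saddle point exists there, whereas for a state far from the origin such as `z = 3` they coincide.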

Let us first solve the single-stage, or one-stage-to-go,
problem. For ease of notation, define $u = u_N$, $v = v_N$,
$z = z_N - 1$, so that

    $z_{N+1} = z + u + v$        (6.5)

and

    $J = (z + u + v)^2 = J(z, u, v)$        (6.6)


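We seek cumulative distribution functions $F^\circ(u|z)$ and
$G^\circ(v|z)$ such that

    $w(z) = \min_{G} \int\!\!\int J(z,u,v)\, dF^\circ(u|z)\, dG(v) = \max_{F} \int\!\!\int J(z,u,v)\, dG^\circ(v|z)\, dF(u)$        (6.7)

Expanding J and writing it in matrix form yields

$$
w(z) = \min_{G(v|z)} \max_{F(u|z)} E \left\{
\begin{bmatrix} 1 & u & u^2 \end{bmatrix}
\begin{bmatrix} z^2 & 2z & 1 \\ 2z & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix}
\right\}
\qquad (6.8)
$$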


By subtracting $w(z)$ from both sides and defining

    $r_i = E[u^i] = \int u^i\, dF(u|z)$,   $i = 0, 1, 2$

    $s_j = E[v^j] = \int v^j\, dG(v|z)$,   $j = 0, 1, 2$        (6.9)

we may write (6.8) as

$$
0 = \min_{s \in S} \max_{r \in R}
\begin{bmatrix} r_0 & r_1 & r_2 \end{bmatrix}
\begin{bmatrix} z^2 - w(z) & 2z & 1 \\ 2z & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} s_0 \\ s_1 \\ s_2 \end{bmatrix}
\qquad (6.10)
$$

where S and R are the sets of admissible moment vectors
$[s_0, s_1, s_2]^T$ and $[r_0, r_1, r_2]^T$, respectively, and
$s_0 = r_0 = 1$.

The set $C_R$ is given parametrically by
$C_R = \{ r \mid r_0 = 1, r_1 = t, r_2 = t^2, t \in [0,1] \}$, and R is
the convex hull of this set. The significant cross-sections of $C_R$
and R are shown in Figure 6-1. We see that
$R = \{ r \mid r_0 = 1, r_1^2 \le r_2, r_1 \ge r_2, r_1 \in [0,1] \}$.
The sets $C_S$ and S are identical in form to $C_R$ and R.

The cone $P_S$ is easily constructed using the cross-section S,
i.e., $P_S = \{ s' \mid s' = \lambda s$ for some $\lambda \ge 0$ and
$s \in S \}$. This set is drawn in Figure 6-2.

The dual cone $P_S^*$ is slightly more difficult to visualize. By
definition

    $P_S^* = \{ s \mid s^T x \ge 0, \forall x \in P_S \}$        (6.11)


Let us use one illuminating method of construction. Pick a
particular point $x_0 \in P_S$ and consider the set $P_S^*(x_0)$

    $P_S^*(x_0) = \{ s \mid s^T x_0 \ge 0 \}$        (6.12)

This will be a half-space in $E^3$ with boundary points $s^\circ$ such
that $s^{\circ T} x_0 = 0$ (Figure 6-3). The region in the direction of
positive $s_0$ belongs to $P_S^*(x_0)$. For two points $x_1$ and $x_2$
in $P_S$, we see that only points $s$ belonging to both half spaces can
belong to $P_S^*$; i.e., $s \in P_S^*$ implies
$s \in P_S^*(x_1) \cap P_S^*(x_2)$. In fact $s \in P_S^*$ implies that
$s \in P_S^*(x_1) \cap P_S^*(x_2) \cap \cdots$ for all $x_i \in P_S$.
Therefore a boundary point of $P_S^*$ must belong to $P_S^*(x)$ for all
$x \in P_S$ and must be a boundary point of $P_S^*(x)$ for at least one
$x \in P_S$.

From Theorem 4.6, we know that boundary points of $P_S^*$
other than the origin can only be generated by points $x$ of $P_S$ which
for some $\lambda > 0$ have the property $\lambda x \in C_S$. Hence the
construction of the boundary requires consideration only of points $s$
from the set

    $\{ s \mid s^T x = 0$ for some $x \in C_S$, $s^T x \ge 0$ for all $x \in C_S \}$        (6.13)

In this example, these comments allow us to restrict our attention
to points $s$ which satisfy

    $s_0 + s_1 t + s_2 t^2 = 0$ for some $t \in [0,1]$,
                                                            (6.14)
    $s_0 + s_1 t' + s_2 t'^2 \ge 0$ for all $t' \in [0,1]$.

If $t \in (0,1)$, then for suitable $\delta$, $t + \delta \in [0,1]$,
and (6.14) is equivalent to

    $s_0 + s_1 t + s_2 t^2 = 0$,   $t \in (0,1)$

and                                                         (6.15)

    $s_0 + s_1 (t + \delta) + s_2 (t + \delta)^2 \ge 0$,   $t + \delta \in [0,1]$

This implies that

    $s_0 + s_1 t + s_2 t^2 = 0$
                                                            (6.16)
    $s_1 \delta + s_2 (2 t \delta + \delta^2) \ge 0$




(6.17) 

(6. 18a) 

(6.18b) 

(6.18c) 


as other boundary surfaces. Combining (6.18a)-(6.18c) yields the
boundaries of $P_S^*$ (Figure 6-4). These are more easily visualized if
the pair of cross-sections in Figure 6-5 is considered.
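
The $s_2 = 1$ cross-section of the dual cone can be probed numerically. The sketch below is a hedged illustration: membership is tested directly from the defining inequality in (6.14) on a grid of $t$ values, while the piecewise boundary formula in `boundary_s0` is our own reading of the cross-section (consistent with the contact lines $s_0 = 0$, $s_0 = s_1^2/4$, and $s_0 + s_1 + 1 = 0$ used later in this section), not a quotation of the original equations. Function names and the grid size are assumptions.

```python
def in_dual_cone(s0, s1, s2, n=1000, tol=1e-12):
    """Grid test of s0 + s1*t + s2*t^2 >= 0 for every t in [0, 1]."""
    return all(s0 + s1 * (k / n) + s2 * (k / n) ** 2 >= -tol
               for k in range(n + 1))

def boundary_s0(s1):
    """Smallest s0 such that (s0, s1, 1) passes the test (assumed form)."""
    if s1 >= 0.0:
        return 0.0              # contact with the curve at t = 0
    if s1 <= -2.0:
        return -1.0 - s1        # contact at t = 1
    return 0.25 * s1 * s1       # interior tangency at t = -s1/2
```

Points on the returned boundary pass the membership test, and lowering $s_0$ slightly makes the test fail, which is the behavior expected of a boundary point of the cross-section.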

With R and $P_S^*$ known, we are ready to proceed with the
problem solution. Let us use the matrix of (6.10) to map R into
S-space; i.e., define

    $S(A(z), R, f) = \{ s \mid \exists\, r \in R \ni s_0 = z^2 - f + 2 z r_1 + r_2$,
                                                                    (6.19)
                      $s_1 = 2z + 2 r_1$,  $s_2 = 1 \}$

For convenience, let us denote $S(A(z), R, f)$ by $S(z, f)$. Then if
$f = w(z)$, $S(z, f)$ intersects $P_S^*$ only at boundary points. We see
that for all $f$ and $z$, $s \in S(z, f)$ implies $s_2 = 1$, so that the
intersection of the sets must occur for this value of $s_2$ and we need
only consider the $s_2 = 1$ cross-section of $P_S^*$. This
cross-section is given in Figure 6-5(b). Let $S'(z, f)$ be the
projection of $S(A(z), R, f)$ on the $s_2 = 1$ plane.

Let us now consider sample values of $z$ and $f$ and perform
the mapping of (6.19).

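    $S'(1, 0) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 1 + 2r_1 + r_2,\ s_1 = 2 + 2r_1 \}$
    $S'(1, 4) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = -3 + 2r_1 + r_2,\ s_1 = 2 + 2r_1 \}$        (6.20)
    $S'(1, 2) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = -1 + 2r_1 + r_2,\ s_1 = 2 + 2r_1 \}$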

These sets are shown in Figure 6-6. Performing the mapping is
aided considerably by the fact that, for given $z$, it is a linear
mapping. Thus the straight line segment $r_1 = r_2$ maps into a
straight line segment of $s_0 = \frac{3}{2} s_1 - f - 2$, and the curve
$r_2 = r_1^2$ maps into a segment of $s_0 = \frac{1}{4} s_1^2 - f$.

Examination of Figure 6-6 reveals forcefully the effect of $f$ in
causing the translation of $S'(z, f)$ parallel to the $s_0$-axis.
Furthermore, it is obvious that $w(1)$ is the maximum value of $f$ for
which $S(1, f) \cap P_S^* \ne \emptyset$, or alternatively the minimum
$f$ for which a separating plane for $S(1, f)$ and $P_S^*$ exists. Since
$f = 4$ has the desired qualities, $w(1) = 4$. This occurs for
$r_1 = r_2 = 1$, so that the pure strategy $F^\circ(u) = I_1(u)$
suffices for the maximizer. The separating hyperplane is $s_0 = 0$,
implying that the pure strategy $G^\circ(v) = I_0(v)$ is used by the
minimizer. (As usual the function $I_x(y) = 1$ for $y \ge x$,
$I_x(y) = 0$ for $y < x$, is used.)
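
The characterization of $w(z)$ as the largest $f$ for which the image of R still touches the dual cone lends itself to a direct numerical search. The sketch below is an illustration under stated assumptions: it uses the mapping (6.19), tests cone membership through the inequality $s_0 + s_1 t + t^2 \ge 0$ on a grid, and searches only the edge $r_1 = r_2$ of R (justified here by the observation that $s_0$ increases with $r_2$ and $r_2 \le r_1$ in R). The function name and grid size are hypothetical.

```python
def one_stage_value(z, n=200):
    """Largest f such that some image point of R under (6.19) lies in the dual cone."""
    ts = [k / n for k in range(n + 1)]
    best = float("-inf")
    for rho in ts:                            # moment point r1 = r2 = rho
        s1 = 2.0 * z + 2.0 * rho
        s0_plus_f = z * z + 2.0 * z * rho + rho
        # Membership needs s0 + s1*t + t^2 >= 0 on [0, 1], i.e.
        # f <= min over t of (s0_plus_f + s1*t + t^2).
        f_max = min(s0_plus_f + s1 * t + t * t for t in ts)
        best = max(best, f_max)
    return best
```

For the sample state `z = 1` this search gives 4, in agreement with the graphical argument above; the same routine can be evaluated at other sample states.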

Before evaluating $w(z)$ in general, let us examine two more
sample values of $z$.

    $S'(-3, 0) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 9 - 6r_1 + r_2,\ s_1 = -6 + 2r_1 \}$
    $S'(-3, 4) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 5 - 6r_1 + r_2,\ s_1 = -6 + 2r_1 \}$
    $S'(-3, 6) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 3 - 6r_1 + r_2,\ s_1 = -6 + 2r_1 \}$
                                                                                        (6.21)
    $S'(-1, -1) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 2 - 2r_1 + r_2,\ s_1 = -2 + 2r_1 \}$
    $S'(-1, 1/4) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = 3/4 - 2r_1 + r_2,\ s_1 = -2 + 2r_1 \}$
    $S'(-1, 2) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = -1 - 2r_1 + r_2,\ s_1 = -2 + 2r_1 \}$

These sets are sketched in Figure 6-7. Looking first at the sets
$S(-3, f)$, we see that $S(-3, 6)$ does not intersect $P_S^*$, that
$S(-3, 0)$ lies entirely within $P_S^*$ and thus does not have a
hyperplane separating it from $P_S^*$, and that $S(-3, 4)$ appears to
both intersect and share the separating hyperplane $s_0 + s_1 = -1$.
Thus it appears that $w(-3) = 4$, and $G^\circ(v) = I_1(v)$.
Furthermore, the intersection point $s_0 = 5$, $s_1 = -6$ corresponds
to $r_1 = r_2 = 0$ in R for $f = 4$, and thus $F^\circ(u) = I_0(u)$.

For the sets $S(-1, f)$, it appears graphically that $w(-1) = 1/4$,
that the separating plane is $2 s_1 + 4 s_0 = -1$, and that for the
point of contact $s_0 = 1/4$, $s_1 = -1$, the corresponding $r \in R$ is
$r_1 = r_2 = 1/2$. Therefore optimal strategies are
$G^\circ(v) = I_{1/2}(v)$ and
$F^\circ(u) = \frac{1}{2} I_0(u) + \frac{1}{2} I_1(u)$, where the latter
indicates a 50-50 mix of $u = 0$ and $u = 1$ for the maximizer. These
values will be verified algebraically below.


With the insight gained from the special cases, we may proceed
to consider more general values of $z$. Note first that every tangent
to the cross-section of the boundary of $P_S^*$ at $s_2 = 1$
corresponds to a point of $C_S$; hence the minimizer uses only pure
strategies. On the other hand, for each $r_1$ corresponding to at least
one $r \in R$ the image points $s \in S'(z, f)$ have the property that
for fixed $s_1$, the value of $s_0$ for $r_2 = r_1$ is greater than or
equal to $s_0$ for $r_2 = r_1^2$. Therefore all optimal intersections of
$S(z, f)$ with $P_S^*$ lie on the line corresponding to $r_1 = r_2$ in
R-space, and the maximizer always uses one of his extreme points
$u = 0$ or $u = 1$, or a mixture of these two points. For this reason we
need only be concerned with the line segments in $S'(z, f)$ given by

    $s_0 = z^2 - f + (2z + 1) t$
                                        $t \in [0,1]$        (6.22)
    $s_1 = 2(z + t)$

in our analysis. Equations (6.22) may be written with $t$ eliminated
as

    $s_0 = (-f - z^2 - z) + (z + \frac{1}{2}) s_1$        (6.23)

In the proofs in Chapter 5, the properties of simple algebraic
maximization were emphasized. For variety, let us utilize here
geometric properties of slope and support hyperplanes.

From Figure 6-5(b) it can be seen that the slope $ds_0/ds_1$
of the boundary of $P_S^*$ is between -1 and 0. Therefore if for given
$z$ the slope of the boundary line of $S(z, f)$ is either less than -1
or greater than zero, we may be sure that the maximizer uses one of his
pure end point strategies $u = 0$ or $u = 1$. From (6.23),
$ds_0/ds_1 |_{r_1 = r_2} = z + \frac{1}{2}$. Hence, the maximizer uses
pure strategies for $z > -1/2$ or $z < -3/2$. For $z > -1/2$, (6.22)
shows that the maximum of $s_0$ occurs for $t = 1$ and that therefore
$s_1 > 0$ at the contact point of $S(z, w)$ and $P_S^*$. It immediately
follows that a separating plane is $s_0 = 0$. Substituting $t = 1$ and
$s_0 = 0$ in (6.22) gives $w(z) = f = z^2 + 2z + 1 = (z + 1)^2$.
Furthermore, $t = 1$ gives $F^\circ(u|z) = I_1(u)$, and $s_0 = 0$ for
the separating plane gives $G^\circ(v|z) = I_0(v)$. These hold for
$z > -1/2$.

If $z < -3/2$, then $s_1 < -2$ from (6.22). In this region a support
hyperplane and contact set with $P_S^*$ is $s_0 + s_1 = -1$, implying
$G^\circ(v|z) = I_1(v)$. The maximum for $s_0$ is at $t = 0$. Since the
contact point occurs on $s_0 + s_1 + 1 = 0$, we have
$z^2 - f + 2z + 1 = 0$. Hence, $w(z) = f = (z + 1)^2$ and
$F^\circ(u|z) = I_0(u)$.

For the region $z \in (-3/2, -1/2)$, the slope of (6.23) lies in
(-1, 0), and $s_1 \in (-2, 0)$ for some values of $t$ (see (6.22)).
Therefore tangency of (6.23) with the curve
$s_0 - \frac{1}{4} s_1^2 = 0$ must be considered in determining the
optimum payoff. The slopes of the two curves must be equal for
tangency (and thus a separating plane) to occur. This requires

    $s_1 / 2 = z + \frac{1}{2}$        (6.24)

or

    $s_1 = 2z + 1$        (6.25)

at the point of contact. Using (6.22), this implies

    $t = \frac{1}{2}$        (6.26)
Hence $F^\circ(u|z) = \frac{1}{2} I_0(u) + \frac{1}{2} I_1(u)$, because
we are working with the $r_1 = r_2 = t$ boundary of R. On $P_S^*$

    $s_0 = \frac{1}{4} s_1^2 = z^2 + z + \frac{1}{4}$        (6.27)

so that since

    $s_0 = z^2 - f + z + \frac{1}{2}$        (6.28)

on $S(z, f)$ in this region, eliminating $s_0$ yields
$f = w(z) = \frac{1}{4}$. The minimizer's pure strategy is concentrated
at $v = -(z + \frac{1}{2})$, i.e., $G^\circ(v|z) = I_{-(z + 1/2)}(v)$.

The cases $z = -3/2$ and $z = -1/2$ are easily evaluated; the sets
$S'(-3/2, w(-3/2))$ and $S'(-1/2, w(-1/2))$ are shown in Figure 6-8.
For $z = -1/2$, $w(-1/2) = 1/4$, $G^\circ(v|z) = I_0(v)$, and
$F^\circ(u|z) = \alpha I_0(u) + (1 - \alpha) I_1(u)$ where
$\alpha \in [0, 1/2]$; i.e., the maximizer has a choice of optimal
strategies. Similarly, for $z = -3/2$, $w(-3/2) = 1/4$,
$G^\circ(v|z) = I_1(v)$, and
$F^\circ(u|z) = \alpha I_0(u) + (1 - \alpha) I_1(u)$ where
$\alpha \in [1/2, 1]$.

The results in terms of $z_N$ are summarized in Table 6-1.


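Figure 6-8. Sets tangent to $P_S^*$


Table 6-1. Results for One Stage of Example 1

    $z_N$                  $F_N^\circ(u|z_N)$                                         $G_N^\circ(v|z_N)$      $w_N(z_N)$

    $z_N < -1/2$           $I_0(u)$                                                   $I_1(v)$                $(z_N)^2$
    $z_N = -1/2$           $\alpha I_0(u) + (1-\alpha) I_1(u)$, $\alpha \in [1/2, 1]$   $I_1(v)$                $1/4$
    $-1/2 < z_N < 1/2$     $\frac{1}{2} I_0(u) + \frac{1}{2} I_1(u)$                   $I_{1/2 - z_N}(v)$      $1/4$
    $z_N = +1/2$           $\alpha I_0(u) + (1-\alpha) I_1(u)$, $\alpha \in [0, 1/2]$   $I_0(v)$                $1/4$
    $z_N > 1/2$            $I_1(u)$                                                   $I_0(v)$                $(z_N)^2$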

The results may also be written in terms of $u'_N$ and $v'_N$ by the
obvious transformations. Note that the payoff may be written

    $w_N(z_N) = \max [z_N^2, \frac{1}{4}]$        (6.29)

This is a piecewise quadratic as expected from the theory.
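
The one-stage results can be checked against the two saddle-point inequalities. The following Python sketch is a verification aid under stated assumptions, not the solution method of the text: it encodes the strategies derived above in terms of $z = z_N - 1$ (choosing $\alpha = 1/2$ at the two boundary states, which is one admissible choice), then confirms on a control grid that neither player can improve on $w(z)$ by deviating. Function names and the grid size are hypothetical.

```python
def payoff(z, u, v):
    """One-stage payoff (6.6)."""
    return (z + u + v) ** 2

def strategies(z):
    """(F, G, w) with F, G as lists of (point, probability); z = z_N - 1."""
    if z > -0.5:
        return [(1.0, 1.0)], [(0.0, 1.0)], (z + 1.0) ** 2
    if z < -1.5:
        return [(0.0, 1.0)], [(1.0, 1.0)], (z + 1.0) ** 2
    # -3/2 <= z <= -1/2: 50-50 mix of u = 0, 1 against v = -(z + 1/2)
    return [(0.0, 0.5), (1.0, 0.5)], [(-(z + 0.5), 1.0)], 0.25

def saddle_gap(z, n=400):
    """Largest deviation of either player's guaranteed payoff from w(z)."""
    F, G, w = strategies(z)
    grid = [k / n for k in range(n + 1)]
    # Best pure reply of the maximizer against G, and of the minimizer against F.
    upper = max(sum(p * payoff(z, u, v) for v, p in G) for u in grid)
    lower = min(sum(p * payoff(z, u, v) for u, p in F) for v in grid)
    return max(abs(upper - w), abs(lower - w))
```

A gap of zero (up to grid resolution) at a state means the listed pair of strategies is a saddle point there and the listed value is correct. When the minimizer's optimal point does not fall on the grid, the computed gap is small but not exactly zero.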

To find $w_{N-1}(z_{N-1})$, we repeat the basic processes above.
Now, however, it is necessary to allow for the piecewise quadraticity
of $w_N(z_N)$. Certainly for $z_{N-1} \le -3/2$ or $z_{N-1} \ge 3/2$
only the curve $z_N^2$ is applicable, for the region $z_N^2 < 1/4$ is
unattainable for any admissible controls $u_{N-1} \in [0,1]$,
$v_{N-1} \in [0,1]$. In this region, then, the results of Table 6-1 will
apply with suitable changes in subscript.
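When the region $z_N^2 < 1/4$ is attainable, the situation is more
complicated. There are several ways to argue concerning the
establishment of the value and strategies for this region; one
interesting technique is to use attainable-set arguments. Let us
instead approximate the polynomial $p(z) = 1/4$ by the polynomial

    $p_\epsilon(z) = \epsilon z^2 + (1 - \epsilon) \frac{1}{4} = \epsilon (z^2 - \frac{1}{4}) + \frac{1}{4}$,   $\epsilon \in (0, 1]$        (6.30)

Then as $\epsilon \to 0$, $p_\epsilon(z) \to p(z)$. Also

    $w_N^\epsilon(z_N) = \max [z_N^2,\ \epsilon (z_N^2 - \frac{1}{4}) + \frac{1}{4}]$        (6.31)

has the same points of discontinuity of $dw_N^\epsilon/dz_N$ as
$dw_N/dz_N$. Let us evaluate the game $p_\epsilon(z_N)$ given
$z_{N-1}$. If $w_\epsilon(z_{N-1})$ denotes the value, then

$$
w_\epsilon(z_{N-1}) = \frac{1 - \epsilon}{4} + \epsilon \min_{G(v)} \max_{F(u)} E \left\{
\begin{bmatrix} 1 & u & u^2 \end{bmatrix}
\begin{bmatrix} (z_{N-1} - 1)^2 & 2(z_{N-1} - 1) & 1 \\ 2(z_{N-1} - 1) & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix}
\right\}
\qquad (6.32)
$$

If we define $z = z_{N-1} - 1$ we see immediately that the portion of
(6.32) of interest, i.e., the portion to be mini-maxed, is the same to
within a bias constant as the one-stage problem (6.8). Therefore
the strategies for the game $p_\epsilon(z_N)$ are independent of
$\epsilon$ and are the same as those for the game $w_N(z_N)$. The value
is

    $w_\epsilon(z_{N-1}) = \max [\frac{1}{4} + \epsilon (z_{N-1}^2 - \frac{1}{4}),\ \frac{1}{4}]$        (6.33)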
As $\epsilon \to 0$, it is clear that $w_\epsilon(z_{N-1}) \to 1/4$.
Suitable strategies for the limit game are, by continuity arguments
similar to Lemma B, limits of the strategies for the game
$p_\epsilon(z_N)$, which we already noted are independent of
$\epsilon$.

The game with payoff $z_N^2$ is precisely the same as $w_N(z_N)$
except for subscripts, and has the same form of strategy. Thus the games
$z_N^2$ and $p_\epsilon(z_N)$ have common optimal strategies, which may
easily be read from Table 6-1. Either by inserting these strategies into
(6.29) or by arguing concerning the continuity of the payoff and the
fact that each branch of the game is lower-bounded by $1/4$, we find
that

    $w_{N-1}(z_{N-1}) = \max [z_{N-1}^2, \frac{1}{4}]$        (6.34)

Noting that this is of the same form as (6.29) and that we have
already argued that the optimal strategies are of the form in
Table 6-1, we see that the multistage game is in fact solved, and in
terms of the original definitions (6.1) the results may be
summarized in Table 6-2.


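Table 6-2. Results for Example 1

    $z_i$ ($i = 1, \ldots, N$)   $F_i^\circ(u'_i|z_i)$                                                  $G_i^\circ(v'_i|z_i)$   $w_i(z_i)$

    $z_i < -1/2$               $I_{-1/2}(u')$                                                         $I_{1/2}(v')$           $z_i^2$
    $z_i = -1/2$               $\alpha I_{-1/2}(u') + (1-\alpha) I_{1/2}(u')$*, $\alpha \in [1/2, 1]$    $I_{1/2}(v')$           $1/4$
    $-1/2 < z_i < 1/2$         $\frac{1}{2} I_{-1/2}(u') + \frac{1}{2} I_{1/2}(u')$*                    $I_{-z_i}(v')$          $1/4$
    $z_i = 1/2$                $\alpha I_{-1/2}(u') + (1-\alpha) I_{1/2}(u')$*, $\alpha \in [0, 1/2]$    $I_{-1/2}(v')$          $1/4$
    $z_i > 1/2$                $I_{1/2}(u')$                                                          $I_{-1/2}(v')$          $z_i^2$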
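
The claim that the value function reproduces itself from stage to stage can be checked in the original coordinates of (6.1). The sketch below is a hedged verification aid: it takes $w(z) = \max[z^2, 1/4]$ as the value of the following stage, uses strategies read from the summary table (with the 50-50 mixture chosen at the boundary states, which is one admissible choice), and confirms the two saddle-point inequalities on a control grid. Function names, the grid size, and the choice of test states are assumptions.

```python
def w(z):
    """Value function of the form (6.29) and (6.34)."""
    return max(z * z, 0.25)

def table_strategies(z):
    """(F, G) for controls u', v' in [-1/2, 1/2], as (point, probability) lists."""
    if z > 0.5:
        return [(0.5, 1.0)], [(-0.5, 1.0)]
    if z < -0.5:
        return [(-0.5, 1.0)], [(0.5, 1.0)]
    # |z| <= 1/2: maximizer mixes the end points, minimizer cancels the state.
    return [(-0.5, 0.5), (0.5, 0.5)], [(-z, 1.0)]

def stage_gap(z, n=400):
    """Deviation of either security level from w(z) when w is also the next-stage value."""
    F, G = table_strategies(z)
    grid = [-0.5 + k / n for k in range(n + 1)]
    upper = max(sum(p * w(z + u + v) for v, p in G) for u in grid)
    lower = min(sum(p * w(z + u + v) for u, p in F) for v in grid)
    return max(abs(upper - w(z)), abs(lower - w(z)))
```

Because the same function appears on both sides of the stage recursion, a zero gap at every state supports, by induction, the same value and strategy form at every earlier stage. The gap is exactly zero only when the minimizer's cancelling control lies on the grid, so test states are best chosen as grid multiples.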

6.2 COUNTER-EXAMPLE: A NON-POLYNOMIAL VALUE

As pointed out in Chapter 5, a polynomial game cannot be
expected in general to have a value function which is a polynomial
in $z$. A simple example will demonstrate this.

Suppose that $u$, $v$, and $z$ are scalars, that

    $J(z, u) = z^2(N+1) - u^2(N)$        (6.35)

and that

    $z(N+1) = z(N) + (z(N) + 1) u(N) + v(N)$        (6.36)

*These are optimal strategies. For i < N it may be shown that
other optimal strategies also exist.

We are interested in finding $w_N(z(N))$. Any other stages of the game
are not of interest in this example. We assume that $u(N) \in [0,1]$,
$v(N) \in [0,1]$.
For ease of notation, certain subscripts may be dropped so
that $z = z(N)$, $u = u(N)$, and $v = v(N)$. The usual steps of
substituting (6.36) into (6.35) and writing out the expression for
$w_N(z)$ give

    $w_N(z) = \max_{F(u|z)} \min_{G(v|z)} E [ z^2 + 2z(z+1) u + 2zv + [(z+1)^2 - 1] u^2 + 2(z+1) uv + v^2 ]$        (6.37)

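In matrix notation, this is

$$
0 = \max_{F(u|z)} \min_{G(v|z)} E \left\{
\begin{bmatrix} 1 & u & u^2 \end{bmatrix}
\begin{bmatrix} z^2 - w_N(z) & 2z & 1 \\ 2z(z+1) & 2(z+1) & 0 \\ (z+1)^2 - 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix}
\right\}
\qquad (6.38)
$$

Using the moment definitions from the first example, (6.38) becomes

$$
0 = \max_{r \in R} \min_{s \in S}
\begin{bmatrix} 1 & r_1 & r_2 \end{bmatrix}
\begin{bmatrix} z^2 - w_N(z) & 2z & 1 \\ 2z(z+1) & 2(z+1) & 0 \\ (z+1)^2 - 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 \\ s_1 \\ s_2 \end{bmatrix}
\qquad (6.39)
$$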

Since the controls appear quadratically, the sets R, S, and
$P_S^*$ are the same as those of Example 1 (Figures 6-1, 6-4, 6-5).
As in that example, form the sets

    $S'(z, f) = \{ s_0, s_1 \mid \exists\, r \in R \ni s_0 = z^2 - f + 2z(z+1) r_1 + (z+1)^2 r_2 - r_2$,
                                                                    (6.40)
                 $s_1 = 2z + 2(z+1) r_1 \}$

and note that $s \in S(A(z), R, f)$ implies $s_2 = 1$, so that only a
cross-section of $P_S^*$ need be considered (Figure 6-5(b)).

Once again the minimizer will use pure strategies, whereas
(because of the varying coefficient of $r_2$ in the equation for $s_0$)
the maximizer may use either mixed or pure strategies. In $S'(z, f)$,
the line $r_1 = r_2$ generates a segment of

    $s_0 = z^2 - f + \frac{3z^2 + 4z}{2(z+1)} (s_1 - 2z)$        (6.41)

Evaluating cases as before, we find that for $s \in S'(z, f)$,
$s_1 \ge 0$ for all $r_1$ if $z \ge 0$. Therefore in this range
$G^\circ(v|z) = I_0(v)$ and (because the contact line is $s_0 = 0$)
$w(z) = 4z^2 + 4z$. Furthermore, since $r_1 = 1 = r_2$ is the best
choice of moments for the maximizer, $F^\circ(u|z) = I_1(u)$. The
strategy is arbitrary for $z = 0$.

If $z \le -1$, then $s_1 \le -2$ and the intersection of
$S'(z, w(z))$ with $P_S^*$ lies on the line $s_0 + s_1 + 1 = 0$.
Therefore

    $f = z^2 + (2z^2 + 4z + 2) r_1 + z(z+2) r_2 + 2z + 1$        (6.42)

If $z \le -2$, then clearly $r_1 = r_2 = 1$ is optimum, yielding
$w(z) = 4z^2 + 8z + 3$, $G^\circ(v|z) = I_1(v)$, and
$F^\circ(u|z) = I_1(u)$. If $-2 < z < -1$, then the coefficients of
$r_1$ and $r_2$ in (6.42) have opposite signs, suggesting a pure
strategy solution for the maximizing player. Maximizing (6.42) over
$r_1 = t$, $r_2 = t^2$ requires

    $t = - \frac{z^2 + 2z + 1}{z(z+2)}$        (6.43)

which after imposing the limits $t \in [0,1]$ implies

    $-1 - \frac{1}{\sqrt{2}} \le z \le -1$        (6.44)

Thus $w(z) = 4z^2 + 8z + 3$, $G^\circ(v|z) = I_1(v)$, and
$F^\circ(u|z) = I_1(u)$ for $z \le -1 - \frac{1}{\sqrt{2}}$. For
$-1 \ge z \ge -1 - \frac{1}{\sqrt{2}}$, (6.42) and (6.43) imply

    $w(z) = f = z^2 - \frac{(z+1)^4}{z(z+2)} + 2z + 1 = - \frac{(z+1)^2}{z(z+2)}$

Also $G^\circ(v|z) = I_1(v)$, and $F^\circ(u|z) = I_t(u)$, where

    $t = - \frac{(z+1)^2}{z(z+2)}$        (6.45)

For $-1 < z < 0$, examination of (6.40) reveals that the coefficient of $r_2$ is negative, implying that the maximizer will use pure strategies. Parameterizing $S'(z, w(z))$ by $r_1 = t$, $r_2 = t^2$ and inserting in the equation (see Figure 6-5(b)) for the boundary of $P_S$ gives

$s_0 = z^2 - f + 2z(z+1)t + z(z+2)t^2 = \left(z^2 + 2z(z+1)t + (z+1)^2 t^2\right) = \tfrac{1}{4} s_1^2$    (6.46)

so that

$f = -t^2$    (6.47)

Here $t = 0$ is the obvious choice; i.e., $F^\circ(u \mid z) = I_0(u)$, in this region. The intersection point with $P_S$ has $s_1 = 2z$, implying the pure strategy $G^\circ(v \mid z) = I_{-z}(v)$ for the minimizer. From (6.47) it is clear that $w(z) = 0$. Table 6-III summarizes the solution and Figure 6-9 shows representative $S'(z, w(z))$ sets. Of particular interest is that for $z \in [-1 - \frac{1}{\sqrt{2}}, -1]$, $w(z)$ is rational but not a polynomial. Therefore, if a further stage is to be solved, the method of dual cones is unlikely to be applicable.
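The pieces of the value function quoted above can be collected into a small numerical check. The following is only a sketch: the function name `example2_value` is hypothetical, it covers only the range $z < 0$ treated in this part of the text, and it assumes the three expressions $4z^2 + 8z + 3$, $-(z+1)^2 / (z(z+2))$, and $0$ as read from the text.

```python
import math


def example2_value(z):
    """Piecewise value w(z) for z < 0, using the three pieces quoted in the text."""
    z_break = -1.0 - 1.0 / math.sqrt(2.0)  # lower end of the rational piece
    if z <= z_break:
        # Both players use the pure strategy I_1.
        return 4.0 * z * z + 8.0 * z + 3.0
    if z <= -1.0:
        # Rational (non-polynomial) piece, maximizer plays I_t.
        return -((z + 1.0) ** 2) / (z * (z + 2.0))
    if z < 0.0:
        # Maximizer plays I_0, minimizer plays I_{-z}.
        return 0.0
    raise ValueError("this sketch only covers z < 0")
```

Evaluating the function on either side of $z = -1 - 1/\sqrt{2}$ and at $z = -1$ suggests that the pieces join continuously, which is consistent with the continuity remarks made later in the chapter.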

Table 6-III. Solutions for Example 2


A SIMPLE PROBLEM WITH VECTORS


The biggest obstacle to finding solutions of a non-numerical nature is dimensionality, for spaces larger than three-dimensional are almost impossible to visualize. The following problem is of small enough dimension to be pictured and still is an interesting problem containing vectors.

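Let $z$ and $u$ be two-dimensional and let $v$ be a scalar for a system with dynamics

$z_1(i+1) = z_1(i) + u_1(i) - \tfrac{1}{\sqrt{2}}\, u_2(i) + \tfrac{1}{\sqrt{2}}\, v(i),$
$z_2(i+1) = z_2(i) + \tfrac{1}{\sqrt{2}}\, u_2(i) + \tfrac{1}{\sqrt{2}}\, v(i),$    (6.48)

and with $v(i) \in [0, 1]$, $u_1(i) \in [0, 1]$, $u_2(i) \in [0, 1]$. For the payoff function choose

$J = z_1^2(N+1) + z_2^2(N+1) - u_1^2(N) - u_2^2(N)$    (6.49)

As in the previous examples, drop the stage indices after substituting (6.48) into (6.49) and use vector-matrix form for $J$ to get

$w(z_1, z_2) = \min_{G(v)} \max_{F(u)} E\left\{ \begin{bmatrix} 1 & u_1 & u_2 & u_1 u_2 \end{bmatrix} \begin{bmatrix} z_1^2 + z_2^2 & \sqrt{2}(z_1 + z_2) & 1 \\ 2 z_1 & \sqrt{2} & 0 \\ \sqrt{2}(z_2 - z_1) & 0 & 0 \\ -\sqrt{2} & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix} \right\}$    (6.50)

Using the usual definitions, this may be rewritten

$0 = \min_{s \in S} \max_{r \in R} \begin{bmatrix} 1 & r_1 & r_2 & r_x \end{bmatrix} \begin{bmatrix} z_1^2 + z_2^2 - w & \sqrt{2}(z_1 + z_2) & 1 \\ 2 z_1 & \sqrt{2} & 0 \\ \sqrt{2}(z_2 - z_1) & 0 & 0 \\ -\sqrt{2} & 0 & 0 \end{bmatrix} s$    (6.51)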

The set $S$ is the same as in Example 1, as is $P_S$. We see that the mapping $S(A(z), R, f)$ once again has $s_2 = 1$, so that Figure 6-5(b) is again usable.

The set $R$ may be constructed by forming the set

$C_R = \{ r \mid r_1 = t_1,\ r_2 = t_2,\ r_x = t_1 t_2,\ t_1, t_2 \in [0, 1] \}$

and then taking its convex closure. The sets $C_R$ and $R$ are shown in Figure 6-10, where $\hat{C}_R$ and $\hat{R}$ are projections for $r_0 = 1$ of $C_R$ and $R$.

The interesting thing about $\hat{R}$ is that it is a tetrahedron and has as its vertices the points $(r_1, r_2, r_x) = (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)$. These points correspond to pure strategies $I_{r_1 r_2}(u) = I_{00}(u), I_{10}(u), I_{01}(u), I_{11}(u)$ respectively.

The set $S'(z, f)$, which is the projection on $s_2 = 1$ of the image of $R$ for a given parameter $f$ and initial state $z$, is defined by

$S'(z, f) = \{ s_0, s_1 \mid s_0 = z_1^2 + z_2^2 - f + 2 z_1 r_1 + \sqrt{2}(z_2 - z_1)\, r_2 - \sqrt{2}\, r_x,\ \ s_1 = \sqrt{2}(z_1 + z_2) + \sqrt{2}\, r_1,\ \ r \in R \}$    (6.52)

We will consider the interactions of this set with $P_S'$ for various values of $z$. Note that the maximizer's moments $r_1$ and $r_2$ may be chosen independently, provided that the coupling $r_x$ is accounted for.

Case 1: $z_1 + z_2 \ge 0$. In this region, $s_1 = \sqrt{2}(z_1 + z_2) + \sqrt{2}\, r_1 \ge 0$ for all admissible $r_1$, implying the pure strategy $G^\circ(v \mid z) = I_0(v)$ for the minimizer and

$f = z_1^2 + z_2^2 + 2 z_1 r_1 + \sqrt{2}(z_2 - z_1)\, r_2 - \sqrt{2}\, r_x$    (6.53)

as the expression to be maximized over $r \in R$.

For $z_1 < 0$, $r_1 = 0$ is obvious, as is $r_2 = 0$ for $z_2 - z_1 < 0$. In both cases $r_x = 0$ follows from the choice of $r_1$ or $r_2$. If both $z_1 > \frac{1}{\sqrt{2}}$ and $(z_2 - z_1) > 1$, then clearly the penalty of taking $r_x = 1$ is worth the benefit from having both $r_1 = 1$ and $r_2 = 1$. If, however, we have $z_1 > 0$, $z_2 > z_1$, but either $z_1 < \frac{1}{\sqrt{2}}$ or $z_2 < 1 + z_1$, then further examination is necessary to determine the desired strategy.

Figure 6-11 shows the form of $S'(z, f)$ for $z$ in this region. The corner markings indicate the points of $R$ which generate the corners. From this it is clear that $I_{01}$ or $I_{10}$ will be preferred depending upon which has the larger coefficient. Thus

$2 z_1 > \sqrt{2}(z_2 - z_1)$
or    (6.54)
$(\sqrt{2} + 1)\, z_1 > z_2$

leads to choice of $I_{10}$, the opposite inequality leads to the opposite choice, and equality implies an arbitrary mixture of the two strategies. Results for Case 1 are summarized in Figure 6-12.
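Since (6.53) is linear in $(r_1, r_2, r_x)$ and the admissible moments form a tetrahedron, a maximum over $R$ is attained at one of the four vertices. The Case 1 reasoning can therefore be checked numerically by comparing the four pure strategies directly. The sketch below assumes the coefficients of (6.53) as read from the text; the names `case1_payoff` and `best_pure_strategy` are hypothetical, and ties (which the text resolves by arbitrary mixtures) are broken by dictionary order.

```python
import math

# Vertices (r1, r2, rx) of the tetrahedron and the pure strategies they represent.
VERTICES = {
    "I00": (0.0, 0.0, 0.0),
    "I10": (1.0, 0.0, 0.0),
    "I01": (0.0, 1.0, 0.0),
    "I11": (1.0, 1.0, 1.0),
}


def case1_payoff(z1, z2, r):
    """Expression (6.53), valid when z1 + z2 >= 0 (minimizer plays I_0)."""
    r1, r2, rx = r
    root2 = math.sqrt(2.0)
    return z1 * z1 + z2 * z2 + 2.0 * z1 * r1 + root2 * (z2 - z1) * r2 - root2 * rx


def best_pure_strategy(z1, z2):
    """Return the name of a vertex maximizing (6.53); ties go to the first vertex listed."""
    return max(VERTICES, key=lambda name: case1_payoff(z1, z2, VERTICES[name]))
```

For instance, `best_pure_strategy(1.0, 0.0)` selects `I10`, in agreement with (6.54), while a state such as `(2.0, 4.0)` satisfies both threshold conditions and selects `I11`.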




Case 2: If $s_1 \le \sqrt{2}(z_1 + z_2) + \sqrt{2} \le -2$, the minimizer uses the pure strategy $G^\circ(v \mid z) = I_1(v)$ and the intersection of $S'(z, w(z))$ with $P_S'$ lies on the line $s_0 + s_1 + 1 = 0$. From this it follows that

$f = z_1^2 + z_2^2 + (2 z_1 + \sqrt{2})\, r_1 + \sqrt{2}(z_2 - z_1)\, r_2 - \sqrt{2}\, r_x + \sqrt{2}(z_1 + z_2) + 1$    (6.55)

Arguing as in Case 1, we find the results which are summarized in Figure 6-13.

Case 3: $z_1 + z_2 < 0$, $z_1 + z_2 > -1 - \sqrt{2}$. This final region is more involved to evaluate because the curved nature of the boundary of $P_S$ ($s_0 = (\frac{s_1}{2})^2$) in part of this area makes possible non-trivial mixed strategies for the maximizer and fractional pure strategies for the minimizer.

Note that on $S'(z, f)$ for $r_1 \ne 0$ we can relate $s_0$ and $s_1$ by

$s_0 = z_1^2 + z_2^2 - f + 2 z_1 \left( \tfrac{s_1}{\sqrt{2}} - z_1 - z_2 \right) + \sqrt{2}(z_2 - z_1)\, r_2 - \sqrt{2}\, r_x$
$\quad\; = -z_1^2 - 2 z_1 z_2 + z_2^2 - f + \sqrt{2}\, z_1 s_1 + \sqrt{2}(z_2 - z_1)\, r_2 - \sqrt{2}\, r_x$    (6.56)

For $r_1 = 0$ or $r_1 = 1$, $s_1$ is constant.

Consider $z_1 \ge 0$, so that $z_2 - z_1 \le 0$. Then the mapping (6.56) of $R$ is of the form shown in Figure 6-14.

Clearly $I_{10}(u)$ is preferred by the maximizer, and contact with $P_S$ occurs on the curve $s_0 = \frac{1}{4} s_1^2$ for $(z_1 + z_2) \le -1$ and on $s_0 = 0$ for $z_1 + z_2 > -1$. The strategy for the minimizer is $I_t(v)$, where $t = -\frac{s_1}{2} = -\frac{1}{\sqrt{2}}(z_1 + z_2 + 1)$ in the former region and $I_0$ in the latter, with the payoff function evaluated accordingly as either

$w(z) = \begin{cases} \dfrac{z_1^2}{2} + \dfrac{z_2^2}{2} + z_1 - z_1 z_2 - z_2 - \dfrac{1}{2}, & z_1 + z_2 \le -1 \\[4pt] z_1^2 + z_2^2 + 2 z_1, & z_1 + z_2 > -1 \end{cases}$    (6.57)

Figure 6-13. Strategies and values for Case 2.


If $z_1 < 0$, more possibilities arise. Let us consider the case $z_2 - z_1 < 0$ in some detail. A possible configuration of the mapping of $R$ is shown in Figure 6-15.

From (6.56) we know that the slope of the line from $I_{00}$ to $I_{10}$ is $\sqrt{2}\, z_1$. Since the slope of the $P_S$ boundary is greater than $-1$ and less than $0$, $z_1 < -\frac{1}{\sqrt{2}}$ implies that $I_{00}$ is the contact point, with suitable interpretations as in the case $z_1 > 0$.

On the other hand $z_1 \ge -\frac{1}{\sqrt{2}}$ implies a contact point either at $I_{10}$ or on the line from $I_{00}$ to $I_{10}$, depending upon the exact values involved. For the line to be tangent to the curve $s_0 = \frac{1}{4} s_1^2$, the slopes must be the same at the point of contact. This implies that

$\tfrac{1}{2} s_1 = \sqrt{2}\, z_1$

This equation along with the definition of $s_1$ on the set $S'(z, f)$ gives

$s_1 = 2\sqrt{2}\, z_1 = \sqrt{2}(z_1 + z_2) + \sqrt{2}\, r_1$
$r_1 = z_1 - z_2$    (6.58)

Since $0 \le r_1 \le 1$, the limits of the range of internal contact are clear. Where it applies, the mixed strategy for player 1 is

$F^\circ(u \mid z) = (1 - r_1)\, I_{00}(u) + r_1\, I_{10}(u)$    (6.59)

and the minimizer strategy is

$G^\circ(v \mid z) = I_{-\sqrt{2} z_1}(v)$    (6.60)

The value is

$w(z) = (z_1 - z_2)^2$    (6.61)

If $r_1$ is limited, the results are obvious.

Similarly, if $z_2 > z_1$ and $z_1 < 0$, the mapping has the appearance of Figure 6-16.

In this case, the line of interest is from $I_{01}$ to $I_{10}$ and has equation

$s_0 = z_1^2 + z_2^2 - f + (2 z_1 + \sqrt{2}\, z_1 - \sqrt{2}\, z_2) \left( \tfrac{s_1}{\sqrt{2}} - z_1 - z_2 \right) + \sqrt{2}(z_2 - z_1)$    (6.62)

In the region of interest, the slope of this line is less than $-1$ and therefore $I_{01}$ is the preferred strategy. A region for which tangency is possible requires $z_1 > z_2$, which violates the hypothesis for the region. The results for Case 3 are summarized in Figure 6-17.

A comment on the nature of the continuity of the results is perhaps in order. Within regions, of course, continuity is obvious. At boundaries of regions, however, the continuity is not always so clear. This is because only upper semi-continuity holds; that is, if $O$ is a sufficiently small open set containing the set of optimal strategies $R^\circ$ at a point $z$, then for $z'$ sufficiently close to $z$, the optimal strategies at $z'$ are contained in $O$. However, $R^\circ$ may not be contained in the set of optimal strategies of $z'$. The meaning of this for the boundary regions is that strategies there are typically not unique. Thus solutions on opposite sides of the boundary may not be near each other although both are near some optimal strategy for the boundary point.

Figure 6-17. Strategies and values for Case 3

For example, consider the Region A boundary $z_1 = -\frac{1}{\sqrt{2}}$ in Figure 6-17. The situation here is as sketched in Figure 6-18. From this it can be seen that any strategy

$F(u) = (1 - \alpha)\, I_{00}(u) + \alpha\, I_{10}(u), \quad \alpha \in [0, -(z_2 + \tfrac{1}{\sqrt{2}})]$    (6.63)

will be optimal for the maximizer. Strategies on both sides of the line $z_1 = -\frac{1}{\sqrt{2}}$ are continuous with this strategy for some $\alpha$.

Figures  6-19  and  6-20  are  sketches  of  the  results  given  in 
detail  in  Figures  6-12,  6-13,  and  6-17. 

6.4  LINEAR  PROGRAMMING  FOR  APPROXIMATE  SOLUTIONS 
Chapter  4  discussed  the  use  of  linear  programming  to  gen¬ 
erate  approximate  solutions  to  game  problems.  We  shall  see  some 
of  the  implications  of  the  technique  in  an  example.  Only  a  simple 
problem  evaluated  at  a  single  data  point  is  needed  to  clarify  the  ideas. 

Consider the game of Example 1, Section 6.1, with one stage to go and with initial condition $z = -1$. From Equation (6.8) we have

$w(-1) = \min_{G(v)} \max_{F(u)} \int_0^1 \!\! \int_0^1 \begin{bmatrix} 1 & u & u^2 \end{bmatrix} \begin{bmatrix} 1 & -2 & 1 \\ -2 & 2 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ v \\ v^2 \end{bmatrix} dF(u)\, dG(v)$    (6.64)

Figure 6-19. Optimal strategy regions for maximizing player.

In Section 6.1 the solution was found to be

$w(-1) = \tfrac{1}{4}$
$F^\circ(u) = \tfrac{1}{2} I_0(u) + \tfrac{1}{2} I_1(u)$    (6.65)
$G^\circ(v) = I_{1/2}(v)$


The  set  R  is  shown  in  Figure  6-1.  Let  us  approximate  it  by 
the  polygon  It  shown  in  Figure  6-21. 

To  lie  within  this  polygon,  x_  must  satisfy 


r2trl 


Vi'i 


>  3  1 

r2l?rl'I 

>5  3 

r2  ?  rl  ‘  I 


(6.66)

The polygon is internal to $R$ and thus our solution point $r$ of the approximate problem will be a viable strategy for the maximizer.

Figure 6-21. Polygonal approximation to $R$.

Now create an approximation $\tilde{P}_S$ to $P_S$ by using the support planes generated by points in $\hat{C}_S$ (Theorem 4.6). A plane will have the general form $\{ s \mid s_0 + t s_1 + t^2 s_2 = 0,\ t \in [0, 1] \}$. Let us choose
t  =  0,  j,  1.  Also  note  that  we  are  interested  only  in 

$s_2 = 1$ because of the transformation matrix in (6.64). Thus we say that if $s \in \tilde{P}_S$ then $s_0$, $s_1$ must satisfy


8o*° 

so  +  F  S1  *  “35 
8o  +  ?  81*  “T T 

.  2  >4 

80  +  5  8l  “ZT 

a  4-  5  .  a  25 

80  +  58l  *  W 

«  +  6  «  a  36 

80  +  T8l  W 


®0  +  8l  ^  ’  1 


(6.67) 


However, after using the usual biasing parameter $f$, we find from (6.64) that $s_0$, $s_1$ must also satisfy

$s_0 = 1 - f - 2 r_1 + r_2$
$s_1 = -2 + 2 r_1$    (6.68)


Substituting this in (6.67), rewriting (6.66), and maximizing $f$, we find that we have the following linear programming problem:

maximize $f$    (6.69)

For  this  problem,  the  solution  is  r°  =  r°  =  ^  and 

f°=|£.  Thus  w  =  |^  and  F°(u)  -  Ifftyu)  +  I^u).  Equality  of 

the  constraints  holds  in  the  first,  ninth,  and  tenth  of  (6.  69).  The 

2 

latter  two  correspond  to  the  hyperplanes  generated  by  t  =  and 

5 

t  =  -g.  It  should  be  noted  that  neither  of  the  latter  planes  is  a 
separating  hyperplane  of  and  the  mapping  of  R  (See  Figure  6-22), 
although each supports $P_S$. Either (or a combination of both) may be used as an approximate strategy for the minimizer, since it is known that pure strategies are sufficient for him.

Figure 6-22. Polyhedra at optimum payoff point.

If  another  iteration  is  used,  with  the  R  approximation  being 

2  13 

the  same  but  with  Pg  approximated  using  t  -  0,  -g,  j,  1  (so  that 

a  smaller  granularity  appears  in  the  region  of  the  possible  solution 

t  =  t  =  -|  fro m  the  first  iteration),  it  is  found  that  w  =  f°  =  — 

and  that  both  r®  *  r®  *  *  and  r®  *  r|  *  ^  yield  this  value 

(as  will  r®  *  r^,  r®  t C^p,  Support  planes  t  =  t  =  ^  give 

1  3 

the  latter  jr  values  and  t  =  -g  give  the  former.  In  this  case 

t  s  j  is  a  separating  hyperplane  and  I^(v)  is  a  good  strategy  for  the 

minimizer.  Either  or  both  of  the  r-moments  may  be  used  by  the 

maximizer  with  justification;  one  suitable  c.d.  f.  is  F°(u)  - 
9  11 

ij;j  1q(u)  +  Ij(u).  Closer  approximations  achieved  by  smaller 
granularity  are  of  course  possible. 
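The effect of the granularity of the support planes can be illustrated with a small exact computation. The sketch below is not the linear program (6.69) itself but a reduced version under stated assumptions: it uses (6.68) in the plane condition $s_0 + t s_1 + t^2 \ge 0$, which gives $f \le 1 - 2 r_1 + r_2 - 2t + 2 t r_1 + t^2$; it assumes the inscribed polygon contains the whole upper edge $r_2 = r_1$ of $R$, so that $r_2 = r_1$ may be taken (the bound increases with $r_2$); and the grids of $t$ values are hypothetical, chosen for illustration rather than taken from the text. The function name `approximate_value` is likewise hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def approximate_value(t_grid):
    """Maximize f subject to f <= (1 - t)^2 - r1 * (1 - 2 t) for every t in t_grid,
    with 0 <= r1 <= 1 (the case r2 = r1). Returns (f, r1) as exact fractions."""
    # Each support plane contributes one line f <= a + b * r1.
    lines = [((1 - t) ** 2, -(1 - 2 * t)) for t in t_grid]
    # The lower envelope of the lines is concave and piecewise linear in r1,
    # so its maximum is at an endpoint or at a crossing of two lines.
    candidates = {Fraction(0), Fraction(1)}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        if b1 != b2:
            r1 = Fraction(a2 - a1) / Fraction(b1 - b2)
            if 0 <= r1 <= 1:
                candidates.add(r1)
    best_f, best_r1 = None, None
    for r1 in sorted(candidates):
        f = min(a + b * r1 for a, b in lines)
        if best_f is None or f > best_f:
            best_f, best_r1 = f, r1
    return best_f, best_r1
```

With the hypothetical coarse grid $t = 0, \frac{1}{5}, \ldots, 1$ the sketch returns $f = \frac{13}{50}$ at $r_1 = \frac{1}{2}$, slightly above the exact value $\frac{1}{4}$, as expected when $P_S$ is approximated from outside. Adding the plane $t = \frac{1}{2}$ to the grid recovers $f = \frac{1}{4}$.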
COMMENTS ON DUAL CONES FOR DIFFERENTIAL GAMES

Two-person zero-sum differential games with closed-loop strategies have been the subject of considerable research interest, and we would be remiss if we did not consider extending our results to such games. We shall find that this extension seems fraught with peril, however, and therefore confine ourselves to comments and to formal arguments. Open-loop strategies are somewhat simpler, but many of the same comments apply.

7.1  THE  PROBLEM  OF  DIFFERENTIAL  GAMES 

The differential game analog of our multistage games has dynamics

$\dot{z}(t) = f(z(t), u(t), v(t), t)$    (7.1)

and payoff function

$J(z(\tau); u(t), v(t); T, \tau) = g_f(z(T)) + \int_\tau^T g(z(t), u(t), v(t), t)\, dt$    (7.2)

where $z(\tau)$ is an initial condition given at time $\tau$ for the dynamics equation (7.1), and $u$ and $v$ are control vectors. In the research to date (see Chapter 2), the functions $f$, $g_f$, $g$ are usually such that pure optimal strategy functions $u^\circ(t)$ and $v^\circ(t)$ exist, and the object has been to determine these functions and the value function $w(z(\tau), T, \tau)$

$w(z(\tau), T, \tau) = \operatorname{val}_{(u(t), v(t))} J(z(\tau); u(t), v(t); T, \tau)$    (7.3)

In some cases it has even been possible to find optimal closed-loop, or feedback, strategies so that $u^\circ(t) = u^\circ(z(t), t)$ and $v^\circ(t) = v^\circ(z(t), t)$. The usual technique has been to apply either a method of characteristics or a Hamilton-Jacobi-Bellman method. The latter method requires the solution of

$-\dfrac{\partial}{\partial \tau} w(z(\tau), T, \tau) = \operatorname{val}_{(u(\tau), v(\tau))} \left\{ g(z(\tau), u(\tau), v(\tau), \tau) + \left[ \dfrac{\partial}{\partial z} w(z(\tau), T, \tau) \right]^T f(z(\tau), u(\tau), v(\tau), \tau) \right\}$    (7.4)

When pure strategy solutions do not exist, the problem becomes more difficult. For differential games even the precise definition of what is meant by a mixed strategy can be elusive, although it will in some sense be a cumulative probability distribution $F(u(t))$ [or $G(v(t))$] over all admissible control functions $u(t)$ [or $v(t)$]. We might think of a closed-loop mixed strategy for the maximizer as a c.d.f. $F(u \mid z(\tau), \tau)$, with a similar function $G(v \mid z(\tau), \tau)$ for the minimizer, and then choose the control vectors at each time instant $\tau$ by making random draws from the proper distribution.

Defining these concepts precisely and computing the optimal strategies is rife with philosophical and mathematical difficulties.

The obvious step of applying the method of dual cones to the pre-Hamiltonian on the right-hand side of (7.4) is not really obvious in implementation and, as we shall see, does not even seem to necessarily lead to definitive results. An intuitively acceptable approach is to discretize the differential game by taking a partition $\Pi$ of the time interval $[\tau, T]$ and to agree to let the controls $u$ and $v$ be constants within an interval $(t_i, t_{i+1})$ of the partition. The resulting multistage game is solvable, at least in principle, and its value $w_\Pi(z(\tau), T, \tau)$ and mixed strategies for each interval may be found. We then accept the limit $w^*(z(\tau), T, \tau)$ of $w_\Pi(z(\tau), T, \tau)$ as the size $|\Pi|$ of the partition $\Pi$ goes to zero as the value of the differential game, provided that the limit exists, and similarly take the optimal mixed strategy limits as suitable for the differential game.
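The discretization just described can be sketched in a few lines. The code below is only an illustration under assumptions: it uses a uniform partition and a forward (Euler) step, and the names `simulate`, `u_policy`, and `v_policy` are hypothetical. The policies are closed-loop callables of the current state and time; a mixed strategy would be represented by a policy that draws its control at random from the appropriate distribution at each stage.

```python
def simulate(z0, tau, T, n_steps, f, u_policy, v_policy):
    """Advance the state from time tau to T with controls held constant on each
    subinterval of a uniform partition; returns the terminal state z(T)."""
    eps = (T - tau) / n_steps          # partition size |Pi|
    z, t = z0, tau
    for _ in range(n_steps):
        u = u_policy(z, t)             # control of the maximizer on this interval
        v = v_policy(z, t)             # control of the minimizer on this interval
        z = z + eps * f(z, u, v, t)    # one stage of the multistage game
        t = t + eps
    return z


# Usage with scalar dynamics dz/dt = u + v, chosen only for illustration.
terminal = simulate(0.5, 0.0, 1.0, 4,
                    lambda z, u, v, t: u + v,
                    lambda z, t: 1.0,   # maximizer always plays u = 1
                    lambda z, t: 0.0)   # minimizer always plays v = 0
```

Refining the partition (larger `n_steps`) corresponds to letting $|\Pi|$ go to zero.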

Fleming [55] shows that if $f$ and $g$ are continuous and satisfy a Lipschitz condition in $z$ and if $g_f$ satisfies a Lipschitz condition on every bounded set, then the limit $w^*$ exists; he conjectures that $w^*$ is indeed the value of the differential game. In a more restrictive theorem, but one applicable for our problem, Fleming [53] proves that if a function $w(z(\tau), T, \tau)$ satisfies (7.4) and is continuously differentiable in an open set containing the region of interest, then

(a) $w(z(\tau), T, \tau) = \lim_{|\Pi| \to 0} w_\Pi(z(\tau), T, \tau)$ uniformly

(7.5)

(b) $w(z(\tau), T, \tau)$ is the value of the differential game with initial condition $z(\tau)$ at time $\tau$ and fixed terminal time $T$.

The latter statement holds in the sense of $\epsilon$-effective closed-loop strategies, that is, strategies which are arbitrarily close discrete approximations of continuous strategies.

Given  this  exceedingly  brief  background,  let  us  first  solve 
a  simple  example  using  limits  of  discrete  approximations  and  then 
consider  the  question  of  direct  evaluation  of  (7. 4)  for  that  example. 
7.2  A  FORMAL  EXAMPLE 

A very simple example will help illustrate some of the points to be made. Let the dynamics equation be

$\dot{z} = u + v, \quad z(0) = z_0$    (7.6)

where $z$, $u \in [0, 1]$, $v \in [0, 1]$ are scalars, and let a payoff function be given as

$J(z(T), u, v, T) = (z(T))^2$    (7.7)

We seek the value and optimal closed-loop mixed strategies for this game.

If (7.6) is approximated by

$z_{i+1} = z_i + \epsilon (u_i + v_i)$    (7.8)

where $\epsilon = (T - \tau)/N$, $\tau \in [0, T)$, then we find that we have a game which is of the type considered in previous chapters. In fact, since $w_{N+1}(z) = z^2$,

$w_N(z_N) = \operatorname{val}_{(u_N, v_N)} \left[ z_N + \epsilon (u_N + v_N) \right]^2$

Letting $w_N^*(x) = w_N(z)/\epsilon^2$ and $x = z/\epsilon$ gives

$w_N^*(x) = \operatorname{val}_{(u, v)} \left[ x + u + v \right]^2$

which is precisely the same as the intermediate problem of Example 6.1. When we use the results of that example, we find


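$w^*_{N-i+1}(x) = \max\left[ (x + i)^2,\ \tfrac{1}{4} \right]$

$F^\circ(u \mid x) = \begin{cases} I_0(u), & x < -i - \tfrac{1}{2} \\ \tfrac{1}{2} I_0(u) + \tfrac{1}{2} I_1(u), & -i - \tfrac{1}{2} \le x \le -i + \tfrac{1}{2} \\ I_1(u), & x > -i + \tfrac{1}{2} \end{cases}$    (7.11)

$G^\circ(v \mid x) = \begin{cases} I_1(v), & x < -i - \tfrac{1}{2} \\ I_{-i + \frac{1}{2} - x}(v), & -i - \tfrac{1}{2} \le x \le -i + \tfrac{1}{2} \\ I_0(v), & x > -i + \tfrac{1}{2} \end{cases}$

This may also be written in terms of $w$ and $z$ as

$w_{N-i+1}(z) = \max\left[ z^2 + 2 i \epsilon z + i^2 \epsilon^2,\ \tfrac{1}{4} \epsilon^2 \right]$

$F^\circ(u \mid z) = \begin{cases} I_0(u), & z < (-i - \tfrac{1}{2})\, \epsilon \\ \tfrac{1}{2} I_0(u) + \tfrac{1}{2} I_1(u), & (-i - \tfrac{1}{2})\, \epsilon \le z \le (-i + \tfrac{1}{2})\, \epsilon \\ I_1(u), & z > (-i + \tfrac{1}{2})\, \epsilon \end{cases}$    (7.12)

$G^\circ(v \mid z) = \begin{cases} I_1(v), & z < (-i - \tfrac{1}{2})\, \epsilon \\ I_{-i + \frac{1}{2} - z/\epsilon}(v), & (-i - \tfrac{1}{2})\, \epsilon \le z \le (-i + \tfrac{1}{2})\, \epsilon \\ I_0(v), & z > (-i + \tfrac{1}{2})\, \epsilon \end{cases}$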
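The scaled recursion can be checked by brute force for a small number of stages. The sketch below rests on two assumptions that should be read as such: the payoff is convex in the state, so the minimizer may be restricted to pure strategies and the maximizer to the end points $u = 0$ and $u = 1$; and the minimizer's control is searched only over a uniform grid, so the result is exact only when the optimal $v$ lies on that grid. The function names are hypothetical.

```python
def scaled_value_closed_form(i, x):
    """Closed form max[(x + i)^2, 1/4] for the scaled value with i >= 1 stages to go."""
    return max((x + i) ** 2, 0.25)


def scaled_value_numeric(i, x, n=20):
    """Stage-by-stage recursion: W_0(x) = x^2 and
    W_i(x) = min over v in a grid on [0, 1] of max over u in {0, 1} of W_{i-1}(x + u + v)."""
    if i == 0:
        return x * x
    best = None
    for k in range(n + 1):
        v = k / n
        # For a fixed pure v the maximizer's best reply is an end point of [0, 1].
        worst = max(scaled_value_numeric(i - 1, x + u + v, n) for u in (0.0, 1.0))
        if best is None or worst < best:
            best = worst
    return best
```

For example, with two stages to go, `scaled_value_numeric(2, -2.0)` returns the constant $\frac{1}{4}$ of the middle region, and `scaled_value_numeric(2, 0.0)` returns $(0 + 2)^2 = 4$. The cost grows exponentially with the number of stages, so this is a check on the formulas rather than a solution method.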

Taking $\epsilon = (T - \tau)/N$, holding $T$ and $\tau$ fixed, and letting $N \to \infty$, gives formally, for $i = N$

$w(z, T, \tau) = (z + (T - \tau))^2$

$F^\circ(u(\tau) \mid z(\tau), \tau) = \begin{cases} I_0(u(\tau)), & z(\tau) < -T + \tau \\ \tfrac{1}{2} I_0(u(\tau)) + \tfrac{1}{2} I_1(u(\tau)), & z(\tau) = -T + \tau \\ I_1(u(\tau)), & z(\tau) > -T + \tau \end{cases}$    (7.13)

$G^\circ(v(\tau) \mid z(\tau), \tau) = \begin{cases} I_1(v(\tau)), & z(\tau) < -T + \tau \\ I_{1/2}(v(\tau)), & z(\tau) = -T + \tau \\ I_0(v(\tau)), & z(\tau) > -T + \tau \end{cases}$

This gives the value of the game starting at time $\tau = 0$ and position $z_0$ as $w(z_0, T, 0) = (z_0 + T)^2$, and yields optimal closed-loop strategies for the players for each $\tau \in [0, T)$.

Substituting in (7.4), we find that for each $\tau$

$2(z + T - \tau) = \operatorname{val}_{(u, v)} \left[ 2(z + T - \tau)(u + v) \right]$
$\qquad = \int_0^1 \!\! \int_0^1 2(z + T - \tau)(u + v)\, dF^\circ(u \mid z)\, dG^\circ(v \mid z)$    (7.14)
$\qquad = 2(z + T - \tau)$

Therefore, by Fleming's results [53] we indeed have a solution to the problem.
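The substitution can also be checked numerically. The sketch below assumes the candidate value $w = (z + T - \tau)^2$ and the dynamics $\dot{z} = u + v$ of the example, approximates the partial derivatives by central differences, and evaluates the right-hand side over the end points of the control intervals only, which suffices here because the bracketed expression is additive in $u$ and $v$. The function names are hypothetical.

```python
def value(z, tau, T):
    """Candidate value function w(z, T, tau) = (z + T - tau)^2."""
    return (z + T - tau) ** 2


def pre_hamiltonian_value(dw_dz):
    """val over u, v in [0, 1] of dw_dz * (u + v); minimizer outside, maximizer inside."""
    return min(max(dw_dz * (u + v) for u in (0.0, 1.0)) for v in (0.0, 1.0))


def hjb_residual(z, tau, T, h=1e-4):
    """Left-hand side minus right-hand side of the Hamilton-Jacobi-Bellman equation."""
    dw_dtau = (value(z, tau + h, T) - value(z, tau - h, T)) / (2.0 * h)
    dw_dz = (value(z + h, tau, T) - value(z - h, tau, T)) / (2.0 * h)
    return -dw_dtau - pre_hamiltonian_value(dw_dz)
```

The residual is zero up to rounding error on both sides of the line $z + T - \tau = 0$, for example at `hjb_residual(0.3, 0.2, 1.0)` and `hjb_residual(-2.0, 0.2, 1.0)`.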


7.3 SOLUTIONS USING LIMITS OF DISCRETE APPROXIMATIONS


The example in Section 7.2 is provocative in that it leads us to conjecture as to which differential game problems may be solved in that same manner. Solving the problems exactly appears to require that the discrete approximations be analytically solvable using the partition size as a parameter, which in turn seems to mean that the discrete problems must be such that the value for each stage is a polynomial and the stage patterns are repetitive so that induction on the stage index is possible. These are clearly restrictive assumptions.

If only approximate solutions are sought or if the problem is such that limit patterns are easily recognizable, then a much broader spectrum of problems may be attacked. In principle, if $f$, $g_f$, and $g$ in (7.1) and (7.2) are polynomials, then the method of dual cones may be applied to any discrete approximation to the differential game and the results of Chapter 5 may be applied. More particularly, this may be done for a sequence $\{\Pi_1, \Pi_2, \ldots, \Pi_n\}$ of partitions of the time interval $[0, T]$, with $|\Pi_{i+1}| < |\Pi_i|$. This will yield sequences of value functions $\{w_{\Pi_i}(z_0, T, 0)\}$ and of corresponding mixed strategies, and an approximate solution to the differential game may be taken either as one of the discrete versions or as a "guessed" limit of the sequence.

There are two important difficulties with the approximate approach. First, the value function may not be a polynomial in the region of interest, so that further approximations are necessary. We remark that, as shown in Chapter 5, this is not a problem if open-loop strategies are sought. The second difficulty is one of dimensionality, for if $|\Pi_i|$ is small, then a great many subintervals will require processing. This may overburden a digital computer regardless of whether open-loop or closed-loop strategies are sought.


7.4 SOLUTIONS BY ANALYSIS OF THE PRE-HAMILTONIAN

It  is  tempting  to  try  to  solve  (7.  4)  directly,  without  resorting 
to  limiting  operations.  Unfortunately,  it  is  necessary  to  be  very 
careful  while  doing  this  for  it  amounts  to  operating  "at  the  limit" 
in  situations  where  the  higher  order  terms  may  be  essential. 

To illustrate this, let us first return to our example. In particular, suppose that the value is known to us but we are seeking the optimal strategies. Then we seek distributions such that

$2(z + T - \tau) = \operatorname{val}_{(u, v)} \left[ 2(z + T - \tau)(u + v) \right] = \int_0^1 \!\! \int_0^1 2(z + T - \tau)(u + v)\, dF(u \mid z, \tau)\, dG(v \mid z, \tau)$    (7.15)

The optimal distributions are obviously those of (7.13) provided that $(z + T - \tau) \ne 0$. However, if $(z + T - \tau) = 0$, then (7.15) does not yield information concerning the strategies. Thus there are both philosophical and practical difficulties in attacking the pre-Hamiltonian.

The reason for the difficulty with the above example is easy to find, for (7.4) is a limit of the discrete form

$\dfrac{w(z, T, \tau) - w(z, T, \tau + \epsilon)}{\epsilon} = \operatorname{val}_{(u, v)} \left[ g(z, u, v, \tau) + \left( \dfrac{\partial w}{\partial z} \right)^T f + \dfrac{\epsilon}{2}\, f^T \dfrac{\partial^2 w}{\partial z^2}\, f + \cdots \right]$    (7.16)

Ordinarily the terms on the r.h.s. containing $\epsilon$ are ignored, for it is claimed that they are dominated by the first two terms. However, in our example this is not the case.

More generally, in solving discrete approximations using the principle of optimality we deal with equations of the form

$w_\Pi(z, T, \tau) = \operatorname{val}_{(u, v)} \left[ \epsilon\, g(z, u, v, \tau) + w_\Pi(z + \epsilon f(z, u, v, \tau), T, \tau + \epsilon) \right]$    (7.17)

In applying the method of dual cones to (7.17), $z$ and $\epsilon$ are simply parameters in the solution. We have already seen that as the parameter $z$ varies, the set $S(A(z), R, a)$ moves relative to the dual cone and may possibly come to or cross a boundary from one form of strategy to another. This is particularly likely if a coefficient within $A(z)$ passes through zero. Since $\epsilon$ may well appear in (7.17) in such a manner that a coefficient in $A(z)$ will be zeroed if $\epsilon = 0$, it is likely the problem for $\epsilon = 0$ will be different in nature from the problem for $\epsilon > 0$. It seems, therefore, that equation (7.4) is useful for sufficiency checks on candidate solutions but is of limited value for synthesis purposes.


CHAPTER 8

SUMMARY, CONCLUSIONS, AND FUTURE WORK

In  this  report  a  viable  solution  technique  for  a  special  class 
of  dynamic  games  has  been  created.  The  necessarily  theoretical 
flavor  of  the  approach  must  not  be  allowed  to  obscure  the  following 
fundamental  result: 

Two-person zero-sum noise-free multistage polynomial games of fixed duration may always be reduced to separable static games if open-loop mixed strategies are sought, and may often be reduced to a sequence of such games when closed-loop mixed strategies are desired. The separable static games may then be solved as mathematical programming problems.

Of particular significance in applications is the fact that the technique is amenable to straightforward intuitively-satisfying numerical approximation; in fact, the well-developed methods and algorithms of linear programming may be used. These results were obtained and extensively discussed in Chapters 4 and 5, and they were illustrated in the examples in Chapter 6.

The method of dual cones, then, has been extended to the point that it may now be effectively applied to some real problems. Nevertheless, much work remains to be done. Numerical approximations should receive detailed attention in order that solutions may be obtained efficiently and precisely, and nonlinear programming formulations should be investigated. The form of the value function must be investigated further; both theoretical questions of algebraic form and practical questions of numerical approximation require answers. The convex sets involved in vector problems need analytical description, if possible. These and related questions should be the subjects of immediate research.

Broader  extensions  of  the  method  of  dual  cones  may  also  be 
possible.  The  need  for  further  investigation  of  its  relationship  to 
differential  games  is  obvious.  For  example,  an  interpretation  in 


other  as  time  varies,  with  the  direction  of  motion  depending  on  the 
dynamics  of  the  game,  can  be  visualized.  Some  of  the  questions 
raised  in  Chapter  7  also  bear  answering. 

Research  should  also  be  performed  on  the  extension  of  the 
method  to  stochastic  games.  Several  approaches  appear  possible 
here.  One  of  the  most  intriguing  possibilities  is  to  note  that 
imperfect  knowledge  of  the  state  may  mean  that  the  set  S(A,  R,  a) 
is  "fuzzy."  Using  this  picture,  it  may  then  be  possible  to  find  not 
only  a  value  but  the  distribution  of  the  payoff. 

Less  obvious  possible  extensions  undoubtedly  exist,  for 
mathematical  game  theory  is  an  extensive  field  with  many 
unsolved  problems. 